Google+ Sign In for Express

We’ve recently published a Node.js client library that lets the Node community talk to Google APIs in a more pleasant way. As a quick starter demo, I’ve implemented a middleware for Connect that lets you add Google+ Sign-In to your Connect-powered projects (such as Express web apps) with a few lines of code.

The sample middleware is called plussignin and is available at burcu/node-plussignin. Assuming you have an existing Express project (or are creating a new one), there are two additional steps to enable Google+ Sign-In:

  • Add the express.cookieParser, express.session and plussignin middleware to your app.
  • Configure plussignin with your client ID, client secret, redirect URI and required scopes. (The client ID and secret are available on the API Console.)

plussignin will add the following routes to your application and handle the OAuth 2.0 flow for your app automagically.

  • /login: Redirects the user to the authorization dialog and asks for confirmation.
  • /logout: Removes the user and their profile from the session and logs the user out.
  • /pluscallback: When a user grants permissions to your app in the auth dialog, this endpoint is called. plussignin exchanges the authorization code with Google’s OAuth 2.0 endpoints for an access token. Once an access token is acquired, it makes a request to retrieve the user’s profile. Once this flow has completed successfully, it puts the user and the user’s profile into the session and redirects the logged-in user to the homepage.
  • /error: If an error occurs during the OAuth 2.0 flow, the user is redirected to /error.
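To make the flow behind /login and /pluscallback a bit more concrete, here is a minimal Python sketch of the two requests involved. It is not how plussignin is implemented; the function names are made up for illustration, and the authorization endpoint URL is an assumption based on Google’s OAuth 2.0 web server flow.

```python
from urllib.parse import urlencode

# Assumption: Google's OAuth 2.0 authorization endpoint for the web server flow.
AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/auth"


def build_login_redirect(client_id, redirect_uri, scopes):
    """URL a /login route would redirect the browser to (hypothetical helper)."""
    params = {
        "response_type": "code",        # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,   # where Google sends the user back
        "scope": " ".join(scopes),      # scopes are space-separated
    }
    return AUTH_ENDPOINT + "?" + urlencode(params)


def build_token_request(code, client_id, client_secret, redirect_uri):
    """Form fields a /pluscallback route would POST to get an access token."""
    return {
        "grant_type": "authorization_code",
        "code": code,                   # the ?code=... value from the callback
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": redirect_uri,
    }


login_url = build_login_redirect(
    "YOUR_CLIENT_ID_HERE",
    "http://localhost:3000/pluscallback",
    ["https://www.googleapis.com/auth/plus.login"],
)
token_form = build_token_request(
    "CODE_FROM_CALLBACK",
    "YOUR_CLIENT_ID_HERE",
    "YOUR_CLIENT_SECRET_HERE",
    "http://localhost:3000/pluscallback",
)
print(login_url)
```

The middleware hides both steps, so your app only sees the result in the session.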

The following snippet illustrates a sample usage.

var express = require('express'),
    plussignin = require('./plussignin.js'),
    MemoryStore = express.session.MemoryStore,
    app = express();

var CLIENT_ID = 'YOUR_CLIENT_ID_HERE',
    CLIENT_SECRET = 'YOUR_CLIENT_SECRET_HERE',
    REDIRECT_URI = 'http://localhost:3000/pluscallback',
    SCOPES = [
      'https://www.googleapis.com/auth/plus.login'];

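// Sessions depend on cookies, so cookieParser must come before session.
// plussignin keeps the user in the session, so it comes last.
app.use(express.cookieParser('something secret'));
app.use(express.session({ secret: 'yet another secret', store: new MemoryStore() }));
app.use(plussignin({ clientId: CLIENT_ID, clientSecret: CLIENT_SECRET, redirectUri: REDIRECT_URI, scopes: SCOPES }));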

// renders the homepage
app.get('/', function(req, res) {
  res.render('index', { plus: req.plus });
});

app.listen(3000);
console.log('Listening on port 3000...');

Some more good news: req objects are extended with several utilities.

  • req.plus.isLoggedIn is true if there is a user in the session.
  • req.plus.oauth2 is a googleapis.OAuth2Client.
  • req.plus.profile is the user’s profile object.
  • req.plus.people.get({ userId: '' }); returns a regular googleapis Request object.

Note: Many asked why this is not a module. Answer: It’s not prod ready. I’m willing to clean it up, provide some other essential features, and release it as a module.

Cluster-based Recommendation with Mahout

Mahout includes a few new experimental recommenders that are poorly documented at the moment. One of them is TreeClusteringRecommender, which clusters your model into a set of groups and makes recommendations based on the distances between your users and items in these clusters.

A cluster-based recommender may be a good choice if your data is sparse and too weakly correlated to reveal obvious patterns. Another advantage is that clustering may help you provide recommendations to users even when very little data is available for them. However, it decreases the level of personalization: the output is not unique to a user but to a cluster.

Here is a quick start to create and run one:

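// model is a DataModel you have already loaded
UserSimilarity similarity = new LogLikelihoodSimilarity(model);
ClusterSimilarity clusterSimilarity = new FarthestNeighborClusterSimilarity(similarity);
// declared as TreeClusteringRecommender so that getCluster() is available
TreeClusteringRecommender rec = new TreeClusteringRecommender(model, clusterSimilarity, 20);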

rec.getCluster(1); // gets the cluster of userId=1
rec.recommend(1, 10); // recommends 10 items to userId=1

What about user similarity and cluster similarity?

  • You basically have to provide a similarity function to measure distances between different users. You may like to represent each user as a vector and calculate the Euclidean, Pearson, cosine, etc. distance over these features. Or LogLikelihoodSimilarity may work as well. You may want to look at Surprise and Coincidence to understand what’s under the hood of this similarity.
  • Cluster similarity is newly introduced here. It’s the place to customize how the similarity between two clusters is measured. There are already two implementations: NearestNeighborClusterSimilarity and FarthestNeighborClusterSimilarity. Beware that clusters are dynamic. As new data comes in, old clusters may be merged and new ones may be introduced.
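The difference between the two cluster similarities is easier to see in a toy example. The Python sketch below is not Mahout code. It assumes that the nearest-neighbor variant scores two clusters by their most similar pair of users and the farthest-neighbor variant by their least similar pair, and it uses a made-up one-dimensional "taste" value per user with a made-up similarity function.

```python
from itertools import product


def user_similarity(a, b):
    """Toy similarity between two users described by a single number."""
    return 1.0 / (1.0 + abs(a - b))


def nearest_neighbor_similarity(cluster_a, cluster_b, similarity):
    """Clusters are as similar as their MOST similar pair of users."""
    return max(similarity(a, b) for a, b in product(cluster_a, cluster_b))


def farthest_neighbor_similarity(cluster_a, cluster_b, similarity):
    """Clusters are only as similar as their LEAST similar pair of users."""
    return min(similarity(a, b) for a, b in product(cluster_a, cluster_b))


# Two hypothetical clusters of users
cluster_1 = [1.0, 2.0]
cluster_2 = [3.0, 6.0]

nearest = nearest_neighbor_similarity(cluster_1, cluster_2, user_similarity)
farthest = farthest_neighbor_similarity(cluster_1, cluster_2, user_similarity)
print(nearest, farthest)
```

Under this assumption the farthest-neighbor variant is the stricter one: two clusters only look similar when all of their members are close, which tends to keep clusters compact.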

The rest is experimental work: finding a method that fits your data by analyzing the nature of your input, plotting the results and evaluating them. The initial clustering takes a lot of time compared to Mahout’s item-based or user-based recommenders, yet it performs well enough online once you start working with pre-computed data. Even though it’s not very mature, you may still like to consider this recommender as a simple starting point if what you want to achieve fits clustering.

The Rise of the Open Science

Open science is opening up the way we do science. It stands for transparency and public accessibility of scientific data, collaboration, methods and results. It also supports public contribution to the current state of science, and giving the results back to the public domain.

Motivation

When we do science, we rely on older publications and methods that were often published with no open access to the underlying data. Years ago, the academic community grew skeptical and started to question the credibility of the research in the existing literature. The way science is funded was one of the chief reasons behind this doubt. Science done with non-open data could easily be steered by politics and by other funding authorities, such as private companies, to misrepresent the facts on topics such as global warming or the side effects of a new medicine. Firing up the openness discussion led to further ideas, such as opening up the methods and the scientific source code.

Why open data, open tools and open results?

One of the core values of science has always been openness and accessibility. Ironically, science today receives heavy financial support from private institutions and governments, where much of the budget is shaped by economic, industrial or military needs. Scientific institutions are mostly closed to people without PhDs when it comes to scientist roles, because there is already huge competition among PhDs. Our credibility is measured by the number of papers we publish and the number of citations we receive. I wouldn’t want to slander scientists, but professional science, living in its own closed ecosystem, conflicts with a few of the key foundations of science. The direction, subjects, people and results of science are controlled, or could be controlled, by authority. In the next few decades, we have to rethink the way we sustain science.

We also have a verification problem with science that relies on data. Computational and statistical science often fails to reproduce the final results advertised in publications. JASA (Journal of the American Statistical Association) reports that only 21% of its papers were published with their source code open in 2011, which is still an improvement over 2006’s 9%. Without code or data, even if the work is published in an academic journal, there is no way to validate or build on the existing findings.

One of the key problems we can address is that scientific research is not sustainable without economic support, because of the cost of scientific tools. I watched Eri Gentry, the founder of BioCurious, speak at OSCON last year. Her key points about opening up scientific tools, from a maker’s point of view, were motivating. According to her, at some point BioCurious needed a PCR machine, which cost several thousand dollars, to keep their garage-based research going. Since they couldn’t afford the machine, they decided to analyze how such machines actually work. Fortunately, they figured it out and created OpenPCR. And now you are able to copy some strawberry DNA sequence or do cancer research at home. An open repository of knowledge on making scientific tools will increase collaboration from regular makers and DIY people, who may otherwise never have the chance to investigate or reverse engineer these tools.

Collaborative Science

With radical changes in the means of communication, discovery and discussion will have to change radically as well. A few months ago, I saw a book by Michael Nielsen called Reinventing Discovery: The New Era of Networked Science in the new arrivals section. Nielsen opens the first chapter with a 2009 story about Tim Gowers’ Polymath Project. Tim Gowers is a very notable mathematician, a Fields medalist from Cambridge University. In 2009, instead of working alone or with his existing peers, he decided to discuss a mathematical problem on his blog and asked readers to share their ideas online. In 6 weeks, he received 800 comments from 27 people. Although the start had its pitfalls, 37 days later Gowers announced that they had solved not just his original problem but a generalization of it that includes the original as a special case.

And what about citizen science? Citizen science used to be perceived as a more professional form of scientific crowdsourcing. But this perception seems to be changing. Very recently, I had a few discussions with friends who are total strangers to citizen science and its current initiatives. Their first reaction was to question the need for citizen scientists. Our main talk was about the classification of galaxies on GalaxyZoo. GalaxyZoo is an online tool that shows you images of galaxies taken by the Hubble telescope and asks you to manually choose whether a galaxy is elliptical or spiral, or whether it has a certain set of features. Any programmer would initially ask why we are doing this classification manually in the 2010s. Honestly, we have the technology to pick up the features directly from the signal without any observation by a human eye. So? Discovery is not classification. We don’t actually know what we are looking at. Any anomaly or strange-looking object could be a new scientific discovery. By reviewing the existing images, GalaxyZoo members discovered a new type of galaxy, which we now call “pea galaxies”. Also in 2007, Hanny van Arkel, a Dutch school teacher, discovered a strange green nebula-like object about the size of the Milky Way, now called Hanny’s Voorwerp.

So, why aren’t we taking it any further? There is an ongoing effort to make a cultural shift that increases awareness of and participation in science. It is not only the Zooniverse projects: NASA opened code.nasa.gov very recently. Ariel Waldman has been keeping a directory of all citizen space exploration projects on spacehack.org for a while. The LHC’s ongoing CMS project donated data to Science Hack Day participants so that data hackers could come up with data visualization tools for CMS. DIYgenomics is crowdsourcing genomic data. The list goes on…

Conclusion

With the ongoing momentum in scientific communities, in the next few decades we’ll experience a tremendous change in the way we do and participate in science. For now, without disrupting conventional means but by creating new possibilities, a new science is approaching, with strong sympathy for making scientific results freely and universally accessible.

Android’s RTP implementation

Although its implementation is still not really mature, Android has supported RTSP streaming for a long time. In theory, playing an RTSP link with MediaPlayer is trivial.

MediaPlayer player = new MediaPlayer();
player.setDataSource("rtsp://..."); // may throw IOException
player.prepare(); // blocks until the player is ready
player.start();

But in practice, the MediaPlayer implementation doesn’t give you useful feedback, so when your media isn’t playing you basically don’t know what’s going on. I will mostly be talking about the network layer, so that you have a basic idea of how to configure your media servers.

RTSP and RTP

We generally call the whole thing RTSP. But RTSP streaming has two parts: RTSP itself and, in most cases, RTP to transport the actual media data. RTSP is a stateful protocol. While making the first connection, client and server agree on a bunch of details and exchange data about the media being served. This is done with a family of directives, which are sent over TCP port 554. The RTSP flow includes OPTIONS, DESCRIBE, SETUP, PLAY/PAUSE, etc. In the request for the SETUP directive, the client specifies which transport protocol it supports (in this case, RTP), over which lower-level protocol and on which ports. Android clients choose UDP and a port range from 15000 to 65k. This range may change from phone to phone and from manufacturer to manufacturer. In summary: there is absolutely no standard at all. If you look at the native MediaPlayer implementation in the Android codebase, you will see no specific range either. So it’s very likely that you will have trouble. Another bad point is that RTP is usually supported on a port range between 9k and 15k on TCP (e.g. on Blackberry devices). And if you read tips and tricks about configuring a server, you won’t find this Android detail mentioned.
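To show where the port range ends up on the wire, here is a small Python sketch that builds the text of a SETUP request. It only formats the message and does not talk to a server. The stream URL, the track name and the helper functions are hypothetical; the request line, CSeq and Transport header follow the RTSP 1.0 message format.

```python
def build_rtsp_request(method, url, cseq, headers=None):
    """Format an RTSP/1.0 request as text (hypothetical helper)."""
    lines = ["%s %s RTSP/1.0" % (method, url), "CSeq: %d" % cseq]
    for name, value in (headers or {}).items():
        lines.append("%s: %s" % (name, value))
    # Headers end with an empty line, as in HTTP
    return "\r\n".join(lines) + "\r\n\r\n"


def udp_transport(rtp_port):
    """RTP over UDP: the client announces a port pair (RTP, then RTCP)."""
    return "RTP/AVP;unicast;client_port=%d-%d" % (rtp_port, rtp_port + 1)


def tcp_transport():
    """RTP interleaved into the RTSP TCP connection itself."""
    return "RTP/AVP/TCP;unicast;interleaved=0-1"


STREAM_URL = "rtsp://media.example.com/stream"  # hypothetical server

setup_udp = build_rtsp_request(
    "SETUP", STREAM_URL + "/trackID=1", 3,
    {"Transport": udp_transport(15000)},
)
setup_tcp = build_rtsp_request(
    "SETUP", STREAM_URL + "/trackID=1", 3,
    {"Transport": tcp_transport()},
)
print(setup_udp)
```

The client_port values in the UDP variant are exactly the ports your firewall and media server have to let through, which is why the unpredictable range on Android matters.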

Note: This post was a draft for about a year; I reviewed it and then posted it. As far as I know from practice, nothing here is outdated. If you disagree, contact me with corrections.

Edit: Some phones fall back to TCP-based streaming when UDP is not available.

Setting bounds of a map to cover collection of POIs on Android

Lately, as I browse the web for maps-related questions on Android, one frequent request is an example of setting the bounds of a map (zooming to a proper level and panning) so that all of the given pins are shown on the screen.

Most maps APIs, such as the Google Maps API, provide this functionality out of the box, so developers seem to have trouble implementing it themselves. The Google Maps API for Android does not provide a way to set the bounds to a box. Instead, it lets you zoom to a span.

com.google.android.maps.MapController.zoomToSpan(int latSpanE6, int lonSpanE6)

latSpanE6 is the difference between the latitudes multiplied by 10^6, and similarly lonSpanE6 is the difference between the longitudes multiplied by 10^6. You may wonder how the map controller knows how far to zoom in just from these differences. For example, the number of kilometers between two longitudes differs from the equator to the poles. Fortunately, the Google Maps projection draws them at the same length everywhere. This may remind you of the infamous South America versus Greenland syndrome: although Greenland is much, much smaller than South America, it doesn’t look so with this map projection.
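As a quick worked example of the E6 arithmetic, the following Python sketch converts two sample coordinates to microdegrees and computes the two spans that zoomToSpan expects. The helper names and the sample points are made up for illustration.

```python
def to_e6(degrees):
    """Convert decimal degrees to integer microdegrees (degrees * 10^6)."""
    return int(round(degrees * 1e6))


def span_e6(points):
    """Return (latSpanE6, lonSpanE6) for a list of (lat, lon) pairs."""
    lats = [to_e6(lat) for lat, _ in points]
    lons = [to_e6(lon) for _, lon in points]
    return max(lats) - min(lats), max(lons) - min(lons)


# Two sample points as (latitude, longitude) in decimal degrees
pois = [(41.0082, 28.9784), (39.9334, 32.8597)]

lat_span, lon_span = span_e6(pois)
print(lat_span, lon_span)
```

Working in integers like this mirrors what GeoPoint.getLatitudeE6() and getLongitudeE6() hand you on Android.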

Below, I implemented a method for MapView that arranges the boundaries. The method takes three arguments: items, hpadding and vpadding. items, as you may guess, is a list of POIs. The other arguments are a little more interesting. hpadding and vpadding are the fractions of padding you would like to leave so that pins don’t appear right at the edges of the map. In this method hpadding is applied to the latitude span and vpadding to the longitude span. For instance, if you assign 0.1 for hpadding, 10% padding will be added at the top and at the bottom.

BTW, you’ll have to extend the existing MapView and add this method to your own subclass to use it properly.

public void setMapBoundsToPois(List<GeoPoint> items, double hpadding, double vpadding) {

    MapController mapController = this.getController();
    // Nothing to show for an empty list
    if (items == null || items.isEmpty()) {
        return;
    }

    // If there is only one result,
    // directly animate to that location
    if (items.size() == 1) {
        mapController.animateTo(items.get(0));
    } else {
        // find the lat, lon span
        int minLatitude = Integer.MAX_VALUE;
        int maxLatitude = Integer.MIN_VALUE;
        int minLongitude = Integer.MAX_VALUE;
        int maxLongitude = Integer.MIN_VALUE;

        // Find the boundaries of the item set
        for (GeoPoint item : items) {
            int lat = item.getLatitudeE6(); int lon = item.getLongitudeE6();

            maxLatitude = Math.max(lat, maxLatitude);
            minLatitude = Math.min(lat, minLatitude);
            maxLongitude = Math.max(lon, maxLongitude);
            minLongitude = Math.min(lon, minLongitude);
        }

        // leave some padding from corners
        // such as 0.1 for hpadding and 0.2 for vpadding
        // (both paddings are computed from the original span before widening it)
        int latPadding = (int) ((maxLatitude - minLatitude) * hpadding);
        int lonPadding = (int) ((maxLongitude - minLongitude) * vpadding);

        maxLatitude = maxLatitude + latPadding;
        minLatitude = minLatitude - latPadding;

        maxLongitude = maxLongitude + lonPadding;
        minLongitude = minLongitude - lonPadding;

        // Calculate the lat, lon spans from the given pois and zoom
        mapController.zoomToSpan(Math.abs(maxLatitude - minLatitude),
                Math.abs(maxLongitude - minLongitude));

        // Animate to the center of the cluster of points
        mapController.animateTo(new GeoPoint(
              (maxLatitude + minLatitude) / 2, (maxLongitude + minLongitude) / 2));
    }
} // end of the method

W3C Widgets: The good, the bad and the ugly

It hasn’t been long since ppk wrote about a totally new W3C movement called “Widgets”. A Widget is a downloadable archive of HTML, JavaScript, CSS and a configuration file. It’s a downloadable web front-end. Basically, it’s designed for building mobile apps that avoid the extra network usage of downloading heavyweight pages, CSS and JS. With Widgets, you only consume network traffic for data transmission. Before getting into details, I have to share that, to my knowledge, Opera Mobile is the only browser around with Widgets support.

You can read Vodafone’s tutorial to make a Widget first to have an initial look.

The Good

For many years, I’ve been in a huge debate with people who use the workforce inefficiently with their 35k different platforms and SDKs. Half of all developers have written HTML once in their life, and JavaScript has a very large developer base. Every new mobile platform usually reinvents the wheel once again, and this default behavior is usually driven by business fears.

Widgets make software accessible anywhere you can run a browser. It’s definitely “Write once, run everywhere”. And the complaints about slow page transmission are being fixed by running them from local resources.

Widgets will push mobile web browsers to behave more alike as the application base grows. Many of the extensions, such as geolocation APIs, don’t really match each other, and some mobile browsers provide totally non-standard features. If web applications dominate mobile, the community will push browsers to behave better.

It’s easy to get in. You don’t have to download SDKs, learn another language and read documentation/tutorials to learn something new.

The Bad

Performance. Native apps run fast. Even Dalvik-powered Android is horrible and not really responsive compared to other platforms’ applications because of Java. Heavy JS in web browsers doesn’t scale and, just like most other browsers, Safari on iPhone has rendering issues even with local websites.

Forget the advantages of the Web when it comes to releasing software. No on-the-fly updates at all. Software has to be downloaded again and again as new versions are released. Access to the internals of the platform is questionable. Open platforms like Android provide access to internals such as contact lists, the file system and invoking other applications. If mobile operating system manufacturers can’t agree on providing similar APIs, this won’t work.

The Ugly

I find the old generation of the mobile development community very ill-minded. They use their know-how to make money, and this community is invested in its complex and closed environments.

On the other hand, the only contributor is Opera for now. I’m not really sure whether they are going for a larger market share or not. If an open standard ends up acting as yet another separate platform for phones with the Opera browser, it’s the same story.

Custom Scroll Distance for UIScrollView

Most recently, I was trying to create a slider for users to navigate between different items. A scroll view was working fine since it natively implements most of the scrolling behavior I needed in my application. But the content I wanted to scroll was smaller in width, and a paging UIScrollView is designed to scroll in multiples of its own width. This was truly a problem. It scrolled 2-3 items at a time and no single item was in focus, whereas I was looking for a one-to-one transition between items.

It would have been possible to listen for touches, calculate the position of the next item and scroll to it. But to be honest, I had no time to try out fancy and unstable solutions. Instead of losing myself in the rules of UIScrollView, I wanted UIScrollView to get lost in me. Remember the rule: “Only scroll multiples of its width horizontally”. Great, so why not shrink the scroll view to the size of one item? Well, because I want the other items to be visible and lined up next to each other to give the user the feeling that it is a slider.

Normally, that is where you stop, but there turned out to be a trick to make it work my way. I decided not to clip the subviews of the scroll view and TA-DA! The images were lined up together and were visible even though they were outside the bounds of my scroll view. A very simple and clean solution. One note before moving on to the “how”: there is a problem with this trick. You cannot interact with items outside the scroll view’s boundaries. If your items are tiny, this is a huge problem because scrolling will only be active for 50-60 pixels. Consequently, use this trick only if the items are at least 50-60% of the whole screen.

Start a new window-based Xcode project. Create a view controller with a xib file. Open the xib file and add a UIScrollView to the main view. Go back to the controller you created and add a property to connect the UIScrollView. Return to Interface Builder. Modify the width, height and position of the scroll view. Connect the controller’s scroll view property to the UIScrollView we created. Enable paging and uncheck “Clip Subviews”. Our scroll view is ready to be filled. In the viewDidLoad method, I’m going to add several images.

// Implement viewDidLoad to do additional setup after loading the view, typically from a nib.
- (void)viewDidLoad {

    [super viewDidLoad];
    int i = 0, cx = 0;

    // Load image1.png, image2.png, ... until one is missing
    for (;; i++) {
        UIImage *image = [UIImage imageNamed:[NSString stringWithFormat:@"image%d.png", i + 1]];

        if (image == nil) break;

        // Each image fills one page: as wide and as tall as the scroll view
        UIImageView *view = [[UIImageView alloc] initWithImage:image];
        view.frame = CGRectMake(cx, 0, scrollView.frame.size.width, scrollView.frame.size.height);
        [scrollView addSubview:view];

        [view release]; // the scroll view retains the subview
        cx += scrollView.frame.size.width;
    }

    scrollView.contentSize = CGSizeMake(cx, scrollView.frame.size.height);
}
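The layout arithmetic in the loop above is what makes the one-item-per-swipe behavior work: every image starts at a multiple of the scroll view’s width, and paging always comes to rest on such a multiple. The Python sketch below is a simplified model of that arithmetic, not UIKit code. The item width of 200 is made up, and real paging also takes the swipe velocity into account.

```python
def item_origins(item_count, page_width):
    """x origin of each item, laid out side by side (cx in the loop above)."""
    return [i * page_width for i in range(item_count)]


def snap_to_page(offset_x, page_width, item_count):
    """Resting offset after a drag: the nearest page boundary, clamped."""
    page = int(round(offset_x / page_width))
    page = max(0, min(item_count - 1, page))
    return page * page_width


ITEM_WIDTH = 200  # hypothetical: the scroll view is narrower than the screen
ITEM_COUNT = 5

origins = item_origins(ITEM_COUNT, ITEM_WIDTH)
rest_after_small_drag = snap_to_page(130, ITEM_WIDTH, ITEM_COUNT)
rest_past_the_end = snap_to_page(950, ITEM_WIDTH, ITEM_COUNT)
print(origins, rest_after_small_drag, rest_past_the_end)
```

Because the resting offsets and the item origins are the same list of numbers, each swipe lands exactly on one item, while the unclipped neighbors stay visible on both sides.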

And finally build and run. It is going to work.

Why should developers blog?

If I made a statistical study of my friends and colleagues who are developers, I could say that barely 10% of them blog. Is blogging a nightmare, a waste of time, a cheap-seat show where bloggers act like significant people, in their eyes? I don’t know. I have one guess: they don’t like writing. These people are tech savvy; they are not like my mother, who doesn’t know how to publish on the Web. They write well enough that their readers won’t complain. They have weekends off and most of them are single.

I started my Internet fanaticism with writing in 1996-97. But releasing my first personal blog took another 10 years, and happened just a few years ago in 2006. Since I started to blog about technology in a more personal space, my life has changed fast. I have met dozens of people regardless of geographical distance, exchanged a great amount of knowledge, made friends, found people who can challenge me and found jobs. Besides these benefits, I have other reasons to blog about technology:

1. It’s natural: You have passion for technology

You don’t have to have a reason. This should be very natural. If you are passionate about technology, you are also passionate about the trends in tech. Blogging has been at the top of the game since 2002-2003. Rejecting what is in fashion won’t make you a cooler person. Early adoption gives a better impression. Running a blog and not making money out of it gives one obvious signal: I’m passionate about what I’m doing; this is not just a job to earn my living.

2. Self promotion

Isn’t it obvious? Nowadays if you are not on the Web, you are nowhere. You will at least have a personal space to introduce yourself to the world of networks. It’s good because you can find others who are interested in similar technologies and are passionate about similar concepts.

3. Self improvement

How could writing improve my own abilities? I’m often asked this. Being socially active is the best way to find more people who can challenge your existing capabilities. I understand you were the smartest of your family, you had great marks at high school, were a top student at college and are now a superstar employee. But thanks to your blog, every day you get replies from smarter people who remind you that you are still a beginner. This is a great opportunity to see that there are no limits and you are never done.

On the other hand, your blog serves as a timeline which captures your interests, technical knowledge and other capabilities. You can easily review yourself by taking a look at your older posts.

4. Sharing Knowledge

And obviously, sharing knowledge is a factor too. Others commonly struggle at the same spots where you stumbled. You may like to share an exceptional bug, a trick to make things work, a newly released product, methodologies, stories, experiences and reviews.

Blogging makes you a better developer. The more people you meet, the better a person you will be. Physically there are many constraints, but an online representation of yourself will speed things up. And as a blogger, I want to get that as an RSS feed! Now there is one thing I’m curious about. Do you blog or not? Why or why not?

Maps Development on Android: Registering a Maps API key

Location-based applications are a must on mobile platforms. Android does not have maps natively, but the Google Maps team provides an add-on that comes with the Android SDK (at least 1.5). In this post, I’m not going to show you how to pop up maps on your little mobile screen, but to highlight the application signature details related to the Maps API.

First of all, in order to show map tiles properly, we need an API key. This is because you are requesting data from the Google Data API and have to agree to the terms of service. I’m also sure that quota rules apply.

Every Android application is signed with the signature of its publisher. While obtaining a key, you must provide the MD5 fingerprint of your signature to Google, and Google enables transactions between the Maps API and the applications your signature signs. In order to complete these actions, you have to

  1. Obtain the MD5 fingerprint of your signature. If you do not have a signature, you can use the default one.
  2. Sign up for an API key directly from Google by providing the hash of your signature.
  3. Use the API key with map elements and generate a sample map view.

Obtaining an API key

You will have to use the keytool to obtain information about signatures. If you haven’t created one, Android SDK puts a default one in your ~/.android directory. In this tutorial, I’m going to show you how to register with this default signature. Open a terminal prompt and enter

$ keytool -list -keystore ~/.android/debug.keystore

It’s going to ask you for the password of the keystore (debug.keystore). The default is “android”. If you receive a MalformedKeyringException, you are giving the wrong password. If everything works, it will output a few lines of information including the hash. Read the fingerprint line and copy the hash.

Certificate fingerprint (MD5): E8:F4:F2:BF:03:F3:3A:3D:F3:52:19:9B:58:20:87:68
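The fingerprint line above is simply an MD5 digest printed as colon-separated hex bytes. As a rough illustration, the Python sketch below formats a digest the same way. It hashes placeholder bytes, not a real certificate, so the function name and the input are purely illustrative; keytool computes the digest over the certificate stored in the keystore.

```python
import hashlib


def md5_fingerprint(data):
    """Format the MD5 digest of some bytes as XX:XX:...:XX (16 bytes)."""
    digest = hashlib.md5(data).hexdigest().upper()
    # Split the 32 hex characters into 16 two-character groups
    pairs = [digest[i:i + 2] for i in range(0, len(digest), 2)]
    return ":".join(pairs)


placeholder = b"not a real certificate"  # stand-in for the certificate bytes
fingerprint = md5_fingerprint(placeholder)
print("Certificate fingerprint (MD5): " + fingerprint)
```

Whatever the input, the result has the same shape as the line keytool prints, and that is the value the sign-up form expects.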

After obtaining the fingerprint, you can jump to the next level: signing up for an API key. Give the hash as input and register. Note down the API key Google gives you.

Generating Maps on Android

The Android SDK comes with two archives. The first one is android.jar, which contains the standard platform libraries. The second is maps.jar, a library dedicated to generating maps. In the maps API, you will notice MapView. You can extend MapView to customize it and add new features to show a custom map view. Or you can invoke the existing methods to perform simple operations like panning, zooming and adding overlays to show information on the default map. There are great tutorials about Android’s map view and controller on the web, and I simply didn’t want to copy the existing ones. Google’s Hello, MapView is a good place to start.

Multiple-Developer Cases

A signature can only be associated with a single API key. What are you going to do if development is spread across a team? You don’t need to create different signatures for each developer and register them to use the Data API one by one. Register a single signature and obtain a key. Then, distribute the signature among the developers – better yet, add it to your version control system.

Redis: New Persistent Key-Value Store

Lately, I’ve been working with Redis, a key-value datastore with interesting characteristics. It’s ultra fast and has built-in atomic operations to handle concurrent usage. Although everything lives in memory, Redis syncs to the hard disk from time to time to serve as permanent storage. Most impressively, downloading Redis and making a working build doesn’t take more than a few minutes.

Redis describes itself as a non-volatile memcached with various built-in data structures. Instead of only key and string value pairs, you can have lists and sets as values. There are atomic pop/push operations to work on these structures and increment/decrement functionality to work on numeric values.

Several client libraries are available including Perl, Python, Erlang, C++, Ruby, Scala and PHP. To write a more meaningful post, I’d like to add lines from a simple Python script.

import redis

storage = redis.Redis()
storage.keys("a*")  # returns keys starting with a

storage.get("key1") # returns the value of "key1"
storage.set("key1", "hello world") # setting the value of "key1"
storage.delete("key1") # deletes the pair with key1.

# working on lists
storage.rpush('key2', 'This is the first value')   # push to the tail
storage.rpush('key2', 'This is the second value')  # push to the tail
print(storage.lpop('key2'))  # pop from the head

Most of these methods are ported to the client libraries, which are available in the downloadable Redis archive.

Although Redis cannot be distributed, it’s easy to set up a slave node to replicate the master. Since it syncs to the hard disk only at certain intervals, data may be lost if the system crashes. So, setting up a slave may decrease the risk. It’s also advised to use Redis on a central server and manage sharding at the application level.
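Sharding at the application level can be as simple as hashing the key to pick a server. The Python sketch below shows one minimal way to do it, assuming a fixed list of Redis servers. The host names and the shard_for helper are hypothetical, and no Redis connection is made here.

```python
import zlib

# Hypothetical Redis servers, each holding a slice of the keys
SHARDS = [
    "redis-a.example.com:6379",
    "redis-b.example.com:6379",
    "redis-c.example.com:6379",
]


def shard_for(key, shards=SHARDS):
    """Pick the server for a key using a stable hash of the key."""
    # crc32 gives the same number on every run and every machine,
    # unlike Python's built-in hash() for strings
    index = zlib.crc32(key.encode("utf-8")) % len(shards)
    return shards[index]


for key in ["key1", "key2", "user:42"]:
    print(key, "->", shard_for(key))
```

The application would then open its client connection to whichever server shard_for returns. Note that with plain modulo hashing, changing the number of servers moves most keys to a different shard, so this sketch suits a fixed set of servers.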