Uploading CSV Files for Tweets

TwitteyBot allows you to schedule tweets, and the scheduling part is an important usability element. We could build a complex web user interface that helps the user with all sorts of scheduling, but a simpler solution is possible. Allowing the user to simply upload CSV files that contain the schedule of the tweets lets the user use other tools like MS Excel to schedule more efficiently. We just released this feature with check-in 71, and it is also deployed on the application. The CSV file should have the following columns:
  1. The date when it should be tweeted. This has to be in the format MM/dd/YYYY (example 12/31/2009).
  2. The time, in the format hh:mm (example 13:15). Note that the hour should be in the 24-hour format.
  3. The actual status message. This can be as long as you want and can be clipped using the web UI. 
When using MS Excel, you can use the 'autofill' feature for scheduling. If the date or time field is empty, the date and time of the previous row are used. A sample file is shown below. A video showcasing the various features is posted below.
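Here is a minimal sample of what such a file could look like; the status text is only illustrative, and I am assuming plain comma-separated values without quoting:

  12/31/2009,13:15,Happy new year to all my followers!
  ,13:30,This row reuses the previous date since the date column is blank
  01/01/2010,09:00,This row was filled in using Excel's autofill feature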



Watch this space for more updates on the TwitteyBot. You can try out this feature on the site, hosted at http://twitteybot.appspot.com.

Google Search Link Copier

One of the Greasemonkey scripts that I wrote was a utility that modifies the Google search result page. By default, the URLs of search items on the results page change to a Google URL when the results are clicked. Hence, when you visit a website from the search engine, your path is tracked. This is good behavior for Web History, but it does not play well while copying or dragging the link content. It also does not work when the user navigates using the keyboard.
This post is about the update to the script that existed earlier. Apart from checking for automatic version updates, the script has been simplified a lot. Google's search page has an "onMouseDown" handler attached to all search results that changes the URL. However, if the user drags the link as a bookmark or into a chat session to paste it, Google's redirect URL is pasted instead of the actual page location. I am not sure why this is attached to "mouseDown"; I think it makes more sense in the "click" event.
This script adds an event handler to "mouseDown" that nullifies the link change made by the page's script. It also attaches a "click" handler that restores the link that Google wants, for Web History's sake. I have also added the scriptUpdateChecker that checks for new versions of scripts. There is some discussion about this here, and I will update it once a conclusion is reached. Watch this space for more updates.
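A minimal sketch of the idea; Google's actual markup differs and changes often, so treat the selectors and handler ordering as assumptions:

  // Keep the real URL for copy/drag, but replay Google's rewrite on real clicks.
  var links = document.getElementsByTagName("a");
  for (var i = 0; i < links.length; i++) {
    (function (link) {
      if (!link.onmousedown) return;              // only results that Google rewrites
      var realHref = link.href;
      var googleHandler = link.onmousedown;
      // After Google's own mousedown rewrites the href, put the real URL back.
      link.addEventListener("mousedown", function () { link.href = realHref; }, false);
      // Restore the rewrite on an actual click, for Web History's sake.
      link.addEventListener("click", googleHandler, false);
    })(links[i]);
  }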

Ubiquity Command : Bookmark on Delicious

One of the Ubiquity commands that I have worked on is "bookmark" to Delicious. The command was using automatically generated tags from a service which is currently dead. I stole a few minutes to quickly change the Tag Generator Yahoo pipe that was feeding the command with tags.
The Tag Generator now picks up the key terms indicated in the Yahoo BOSS search result for the web page. You can take a look at the modified pipe here. Unfortunately, the only part that did not work was the use of Delicious tags. For some reason, the Delicious XML element in the response disappears when the pipe hits the filter module in the source code.
Eager to test, I headed to Delicious. Unfortunately for me, I had just linked my account with my Yahoo ID, rendering the V1 APIs useless. The V2 APIs require OAuth, something that is not yet supported directly in Ubiquity. I was planning to write an OAuth library for Ubiquity, but that is for later. Hence, the bookmark command is broken for now if you are using the newer Delicious accounts. I am falling back to the built-in share-on-delicious command to get my work done. That command circumvents the requirement for OAuth by picking up the cookie from the browser using Firefox's native Cookie Manager (Components.classes["@mozilla.org/cookiemanager;1"]). Interestingly, the AJAX call is also made by fetching the XMLHttpRequest object in a native way. Watch this space for more updates on the OAuth utility and my other Ubiquity commands.
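For reference, here is roughly how chrome-privileged code can read those cookies through XPCOM; the Delicious host filter is my own assumption, not necessarily what the built-in command does:

  // Sketch: read browser cookies via the native Cookie Manager.
  var cookieManager = Components.classes["@mozilla.org/cookiemanager;1"]
      .getService(Components.interfaces.nsICookieManager);
  var cookies = [];
  var iterator = cookieManager.enumerator;
  while (iterator.hasMoreElements()) {
    var cookie = iterator.getNext().QueryInterface(Components.interfaces.nsICookie);
    if (cookie.host.indexOf("delicious.com") !== -1) {
      cookies.push(cookie.name + "=" + cookie.value);
    }
  }
  // The joined string can then be sent as a Cookie header with the API request.
  var cookieHeader = cookies.join("; ");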

Twittey-Bot :: Technical Details

Looks like Guy Kawasaki was talking (point 6) about using twitter for web 2.0 publicity again, at the recent Nasscom - Tweetup meet. Twitteybot is just that, a dead simple tweet automator :). This post deals with the technical details. The source code is available here. The overall architecture of the website is simple.
The application is hosted on the Google App Engine and uses the built-in persistence, user accounts, cron jobs and tasks.
Starting from the persistence layer, there are two major objects: one to store the Twitter account auth tokens, and the other to store the actual tweets that have to be scheduled. A third object stores the application properties, primarily the Twitter OAuth key and secret. I put them in the database instead of a properties file to enable revocation, etc. without having to redeploy the application.
The servlet layer simply fetches this data using either the TwitterAccountManager or the StatusManager classes. The UI makes direct calls to these classes to manage the data at the backend.
The UI is simply an interface to modify the objects at the backend. The only other functionality is the toolboxes on the left that let a user schedule and shrink the tweets. These are purely JavaScript functions that modify the values to be sent to the backend.
The cron job runs every minute, checking for tweets older than the current time. If it finds any, it adds them to a task queue. The task queue looks at the tweets one after the other and is responsible for posting them to Twitter. The Twitter OAuth credentials are saved in memcache as they do not change very often.
A lot of code has been written, and a lot more needs to be. Please take a look at the site and criticize, suggest or even appreciate.
Here is the list of issues that I will be working on once I get back. I would definitely welcome help, so if you are interested in churning out some code, do drop me a comment.

Finally, I have a website !!

A weekend of hacking, some static HTML pages spiced with CSS and JavaScript, and I have a presence on the internet. I just published the last commit to GitHub and have a decent looking website up and running. The site is available here.
Some of the requirements that I had set are
  • Progressive JavaScript: the site should also work without CSS/JavaScript
  • Only static content would be possible as the site is hosted on github
  • Most contents of the site have unique URLs. 
I will not rant about the design or the code of the site; instead, I will simply highlight two interesting challenges that I came across while building it.

The site is static, but I wanted the home page to display posts from this blog. I also wanted to show randomly picked projects. The FeedBurner JavaScript had all the content preceded by a document.write. Since I was loading this content using jQuery's $.load, document.write was executed in the window context and hence overwrote the page contents. The way this was solved was to load all this content in an iframe. The iframe calls a method in the container document with the HTML content, which is then displayed with the formatting.
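A rough sketch of the hand-off; the function and element names are made up for illustration:

  // Inside the hidden iframe that loads the FeedBurner script: document.write
  // runs safely here, and once loaded we hand the HTML back to the parent page.
  window.onload = function () {
    parent.renderFeed(document.body.innerHTML);
  };

  // In the container page: receive the markup and drop it into the home page.
  function renderFeed(html) {
    document.getElementById("blog-posts").innerHTML = html;
  }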

The second challenge was having the fragment identifier act as a navigator. Though this is a well-known pattern, the only deviation in this case was that the module names and the HTML had to be the same. The JavaScript does not poll for the URL fragment identifier yet, but the fragment does change when the user navigates to a project, etc.

Please do check out the site which is more of a cataloged archive for this blog.

I can see the future ... that's why I schedule my tweets

That was a neat little catch-phrase to introduce the project that I have been working on over the last weekend - Time2Tweet. The project helps compulsive [or lazy :) ] "tweeters" schedule tweets that are uploaded as a file.
The idea started when I was looking for a tool that lets me automate tweets, typically to send things like quotations, geek humor and other things to my Twitter account. There are tools to update Twitter from RSS feeds or create tweet publicity, but nothing for bulk scheduling.

The application is simple; all that the user does is log into the application (the application is on Google App Engine, so logins use Google Accounts) and authorize Twitter accounts. Once this is done, the user uploads a file that contains one tweet per line. There are scheduling options that let the user specify the time interval or the total duration all the tweets should take.
The user interface was inspired from Google Reader, and the application on App Engine uses features like cron jobs and tasks lists to get the job done.
One discussion that came up was regarding the requirement to have a user log in with Google credentials. Though this method allows users to manage multiple accounts, most individuals would schedule with only one account. Maybe I will fork a branch sometime soon to implement this variant.
The typical uses of this application could be
  • Help companies send marketing messages easily
  • Bots aimed at teaching foreign languages, one tweet at a time. This, I presume, is a lot easier than reading books.
  • Create bots that quote tweets from books like the Gita, the Quran or the Bible.
In a post that will follow, I plan to write up the technical details. In the meantime, if you would like to take a look at the code, it is hosted here and the application is available at http://time2tweet.appspot.com.

Duration Input Component

Free-form, input-box-style controls are always fascinating. It is always a pleasure to type in dates and have the computer understand them automatically. I came across Date.js, a date-processing library that parses such free-form dates. I was working on the TwitteyBot over the weekend and I needed a component where a user could enter time durations.
As a requirement, the user would enter durations that resemble any of the following text
  • 1 day, 10 hours and 15 minutes
  • 1 minute + 1 hour
  • Duration is 1 minute 
I tried writing a parser similar to Date.js, and here is how it looks.
The source code for parsing this is simple and self-explanatory.
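A minimal sketch of the parsing approach, assuming only days, hours and minutes need to be recognized:

  // Returns the total number of minutes described by free-form text like
  // "1 day, 10 hours and 15 minutes" or "1 minute + 1 hour".
  function parseDuration(text) {
    var units = { day: 24 * 60, hour: 60, minute: 1 };
    var pattern = /(\d+)\s*(day|hour|minute)s?/gi;
    var total = 0;
    var match;
    while ((match = pattern.exec(text)) !== null) {
      total += parseInt(match[1], 10) * units[match[2].toLowerCase()];
    }
    return total;   // in minutes
  }

  parseDuration("1 day, 10 hours and 15 minutes");   // 2055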

Autoupdating Greasemonkey Script

Firefox has automatic updates, its extensions have them, even Ubiquity scripts have them; but not Greasemonkey. It is not a great user experience to have to update Greasemonkey scripts manually every time the site changes.
The Yousable Tube Fix script has a module that checks for new versions of the script and automatically updates them. The idea was simple and effective. The page where the Greasemonkey script is hosted has a unique string somewhere in the page that looks like #[V:1]#. The number "1" in the string denotes the latest version available.
The update module is configured with the location of the script, as identified by the number in the URL of the script on userscripts.org. The script also has a version number that indicates the version of the script file. The module fetches the script page using GM_xmlhttpRequest and searches for the version string. If it finds that the number on the page is greater than the one specified in the file, the location of the script is opened. Since Greasemonkey is installed, the *.user.js file is recognized as a script and the Greasemonkey install dialog shows up.
To add this functionality to your Greasemonkey script, just copy and paste the snippet and configure it as follows.
  1. Paste this scriptlet at the bottom of your script.
  2. Set scriptPage to the number in the URL of userscripts page.
  3. Set scriptVersion as the version of the script in source code.
  4. In the description of the script, add your version number as
    #[V:2]#, where 2 is the integer version number.
You can find the source code for the update module here.
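In essence, the module boils down to something like this; the userscripts.org URL patterns and variable names are from memory, so treat them as assumptions:

  // scriptPage and scriptVersion are the two values you configure; the hosting
  // page is expected to contain a #[V:n]# marker with the latest version.
  var scriptPage = 12345;      // the number from the userscripts.org URL
  var scriptVersion = 1;       // the version baked into this file

  GM_xmlhttpRequest({
    method: "GET",
    url: "http://userscripts.org/scripts/show/" + scriptPage,
    onload: function (response) {
      var match = response.responseText.match(/#\[V:(\d+)\]#/);
      if (match && parseInt(match[1], 10) > scriptVersion) {
        // Opening the install URL makes Greasemonkey show its install dialog.
        window.location.href = "http://userscripts.org/scripts/source/" + scriptPage + ".user.js";
      }
    }
  });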


Reddit Bar :: Couple of changes more

Reddit made a couple of changes on their site earlier this week, breaking the current implementation of the reddit bar. This post details the two quick fixes to get the script working again.
The first fix was to remove the 1x1 pixel image that now appears on the reddit page. The picture, present at http://pixel.reddit.com/pixel/of_destiny.png?v=, looks like a tracker image.
The second change was to hide the correct content DIV; apparently, another DIV was assigned the content class, due to which the actual content was still showing and the iframe containing the page was rendered below it. The fix was to hide content[1].
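The two fixes boil down to something like this (class and file names as observed on the page at the time):

  // Fix 1: remove the 1x1 of_destiny.png tracking pixel from the page.
  var images = document.getElementsByTagName("img");
  for (var i = images.length - 1; i >= 0; i--) {
    if (images[i].src.indexOf("of_destiny.png") !== -1) {
      images[i].parentNode.removeChild(images[i]);
    }
  }

  // Fix 2: a second element now carries the "content" class, so hide it too.
  var content = document.getElementsByClassName("content");
  if (content.length > 1) {
    content[1].style.display = "none";
  }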
You may have to install the Greasemonkey script again to get it working. I am also working on a generic way to update Greasemonkey scripts, something I picked up from the Yousable Tube Fix script.

Ratings between the half and whole

Most rating widgets I have come across show stars in half or whole. This is usually implemented using images corresponding to half, full or empty stars. The user hovers over the widget to set the rating, again between whole and half stars.
However, sites like IMDB and Orkut have more granular ratings, showing partial stars. The implementation is interesting and, in a way, more efficient; it also seems simpler than the older way of showing ratings.
Two images, one with empty stars and one with filled stars, are used to create the effect. The second image (the empty stars) is used as the background of a DIV. Inside this DIV is placed another DIV, whose repeated background image is the first image (the filled stars). The width of the inner DIV depends on the rating. User interaction would be similar to sliders, with mouse drag replaced by hover.
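A bare-bones sketch of the technique; the image paths and the 16px star size are assumptions:

  // Builds a partial-star rating bar; rating is on a 0-5 scale.
  function renderRating(container, rating) {
    var starWidth = 16;                              // width of one star in px (assumed)
    var outer = document.createElement("div");       // empty stars, repeated
    outer.style.background = "url(stars-empty.png) repeat-x";
    outer.style.width = (5 * starWidth) + "px";
    outer.style.height = starWidth + "px";

    var inner = document.createElement("div");       // filled stars, clipped by width
    inner.style.background = "url(stars-full.png) repeat-x";
    inner.style.width = (rating / 5 * 100) + "%";    // e.g. 3.7 stars -> 74%
    inner.style.height = "100%";

    outer.appendChild(inner);
    container.appendChild(outer);
  }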

Gmail Burglar Alarm : Statistics now Inside Gmail

Hi,

A few days ago, I came across the Twitter Gmail Gadget. It has a neat interface with all data displayed inside Gmail; no external pages, etc. Since Gmail gadgets are also OpenSocial applications, they have a canvas view in addition to the profile view that shows up on the sidebar. The canvas view takes up the larger part of the screen.
We were also running out of space at the bottom of the gadget that had cryptic options to set the various parameters. The latest release has just one link at the bottom of the gadget visible on the left pane. Clicking on this takes you to the larger details view that opens in the right pane.
The right pane shows the statistics recorded in Google Calendar. It also has a tab showing the visits recorded by bit.ly. If a URL is not configured, there is text in the tab trying to explain how URL trackers like bit.ly can be used and how a bit.ly URL can be configured inside the gadget.
As far as the code is concerned, the change was not much. Previously, the two pages were opened using window.open. Now, they are simply shown as iFrames inside the tab pages. The height of the gadget is also adjusted after user actions for aesthetics.
If your gadget has not refreshed automatically, you may have to delete your existing gadget and add a new instance. This sure may delete the total time shown, but that should be fixed in the next release. I am also planning to have a home page for the gadget site hosted on Google App engine.

Chroma-Hash - Gradient implementation using Canvas

A few posts ago, I had written about an implementation of Chroma-Hash using a huge background image. The idea was to use the image for generating a unique gradient as the background of the password field.
Some more changes were made to the idea a few days ago, including using canvas as a gradient generator on HTML5 browsers, and using a salt unique to every user. The project is available here. The additions in the current HEAD revision are as follows.
When "canvas" is selected as the visualization type, a canvas with opacity less than one is placed exactly on top of the password box. Since the canvas sits above the password box, it may intercept all clicks intended for the box. Hence, whenever the mouse moves over it or clicks on it, the canvas is hidden. It shows up again after some time, or when the mouse moves out of the password box. Apart from this, the hash of the password is computed and a gradient is drawn by splitting the hash into four colors.
Another change is the addition of a salt to prevent hashes from being recognized. The salt is unique to a user. Similarly, an option was added to derive the salt from the domain name, as a way to protect against phishing. As too many changes to the salt make it hard to recognize the gradient, I am currently working on a way to indicate the domain color as the start of the gradient instead of the salt.
The final change is the upgrade to the Greasemonkey script. The Greasemonkey simulator was for testing. Currently, it inserts the script into the page and Chroma-Hash is activated. One side effect of this is that the script can be found in the edit area of the pages. The next release will remove this and bake the Chroma-Hash logic inside the Greasemonkey script. Watch this space for updates.

Ubiquity Command - Linkify upgraded to Parser 2

A few posts ago, I had written about the changes to Ubiquity commands due to the change in the Ubiquity parser. This post details the changes to the linkify command.
As written earlier, the preview of the command cannot be used for interaction. Hence, the command does not take any inputs except the selected text when it is invoked. Considering this as a search term, the command uses the Yahoo BOSS API and displays search results like its earlier versions. The only difference here is that the search results and the other UI are embedded on the page.
The user can change the search term before clicking on the appropriate link that makes the selection a link. The only trick here is that when the user changes the search term, the selection on the page changes. Hence, the selection is stored in a local variable to be used when creating the link. When the user clicks the link, the saved selection range is activated and the browser selects the same text that was selected when the command was invoked.
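The selection bookkeeping is roughly this (names are illustrative):

  // When the command is invoked, remember what was selected.
  var savedRange = window.getSelection().getRangeAt(0).cloneRange();

  // Later, when the user clicks a search result, put the selection back
  // before wrapping it in a link.
  function restoreSelection() {
    var selection = window.getSelection();
    selection.removeAllRanges();
    selection.addRange(savedRange);
  }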
You can subscribe to the command by visiting the command page.

Screen Scraping with Javascript, Firebug, Greasemonkey and Google Spreadsheets

Most of the web page scrapers I have seen are written in Perl, mostly due to the power of its regexes. However, nothing is friendlier for parsing web pages than JavaScript.
The pages to scrape are usually like this or this.
With the power of regex inherited from Perl and DOM parsing from jQuery and its likes, scraping data from web pages is a lot easier. This article outlines the techniques that make screen scraping of web pages easy.
Writing a web page scraper usually involves the following steps.
  1. Identification : identifying Page Elements
  2. Selection : getting data out of the selected nodes
  3. Execution : running the code in the context of the page.
  4. Submission : saving the parsed data so that it can be used later
  5. Timeouts : to introduce delay so that the server is not overwhelmed.
Identifying nodes can get tricky, but with Firebug, it's a simple point-and-click exercise. Firebug gives us the entire XPath to the node in question. Web scraping usually involves picking up data from a structured document that has elements repeated in some pattern. "Inspecting" elements would reveal a pattern, usually a common class name or a hierarchy.

Once identified, we can get the code to select all the elements interactively using the Firebug console. This would usually be a combination of XPath expressions, getElementsByTagName, getElementsByClassName, etc. Once you have all the elements (usually as rows), you can dive deeper into each element till you extract the innerHTML or href. This is what goes into the code.

Once you have the code returning useful data (checked using the Firebug console), you need a way to run it for the page and possibly load the next page for parsing once the current page is done. The JavaScript code could be inserted using a bookmarklet, but that would require the user to click the bookmarklet for every page. I chose to add the small Greasemonkey header and convert these scripts to Greasemonkey scripts. This ensures that we parse the current page and move to the next automatically.
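The header is just a comment block at the top of the file; the names and the include pattern below are placeholders:

  // ==UserScript==
  // @name        Site scraper
  // @namespace   http://example.com/scraper
  // @description Scrapes the listing pages and posts rows to a spreadsheet
  // @include     http://www.example.com/listing*
  // ==/UserScript==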

One reason why most people don't use JavaScript for parsing is its inability to store the parsed data. This is where Google Spreadsheets comes to the rescue. Google Spreadsheets lets us create forms to which we can POST data. The script needs to create a form with its action set to a URL that resembles "http://spreadsheets.google.com/formResponse?formkey=yourkey". You also have to create input elements with names resembling entry.0.single, entry.1.single and so on. You can check the actual data submitted to the spreadsheet using Tamper Data. We now have all our data in a spreadsheet, giving us sorting and filtering capabilities.
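A sketch of the submission step; the form key and the entry field names must match your own spreadsheet form, and posting into a hidden iframe (my addition) keeps the scraped page in place:

  // Builds a hidden form that POSTs one scraped row to a Google Spreadsheets form.
  function submitRow(values) {
    var iframe = document.createElement("iframe");
    iframe.name = "sheet_target";
    iframe.style.display = "none";
    document.body.appendChild(iframe);

    var form = document.createElement("form");
    form.method = "POST";
    form.target = "sheet_target";
    form.action = "http://spreadsheets.google.com/formResponse?formkey=yourkey";
    for (var i = 0; i < values.length; i++) {
      var input = document.createElement("input");
      input.type = "hidden";
      input.name = "entry." + i + ".single";    // entry.0.single, entry.1.single, ...
      input.value = values[i];
      form.appendChild(input);
    }
    document.body.appendChild(form);
    form.submit();
  }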

The last point is about preventing loops and switching to timeouts instead. This would ensure that you don't overload the server with requests. A good feedback mechanism, something like coloring the background of fields that are parsed successfully, would be an added bonus.

To conclude, scraping data this way may require your browser to be open all the time, but some of the benefits over the command line way (that I could think of ) are

  1. Easiest way to handle the DOM, no regex, just DOM traversing
  2. Visual indication of what data is parsed, real time
  3. Proxy and TOR configuration out of the box if your IP is blocked :)
  4. With webworkers, complex parsing could be done a lot easier
Writing all this from scratch is hard; maybe you could use this template. The template is littered with placeholders where you can insert your code.


Adobe Dev Summit


Attending the Adobe DevSummit at Bangalore.

The Chroma Hash effect

There have been quite a lot of discussions on the usability issues with password masking. I came across efforts like using a mask, representing the hash with a graph, and more lately representing the hash with colored bars. The idea was interesting and I decided to fork my own project and experiment with other visualization techniques.
The idea is to represent the 32-character hex hash in a way that the user easily recognizes. The representation should also ensure a sufficient delta between two similar passwords. With 3.402823669209385e+38 possible values, finding the right visualization may be hard. The plan was to divide this by four and display unique pictures fetched from the internet, but getting a consistent source of so many pictures is hard. Though the number seems huge, the practically used hash values would be far fewer. Compromising on accuracy, a gradient image was used as the background of the text box. The background-position of the image is moved to display different gradients for different passwords. The image is 3000x900, and the position is hence the hash value modulo the dimensions.
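The position calculation is roughly this (the image path is a placeholder):

  // Map a hex hash onto a point in the 3000x900 gradient image and shift the
  // background of the password field there.
  function setHashBackground(field, hash) {
    var x = parseInt(hash.substr(0, 8), 16) % 3000;
    var y = parseInt(hash.substr(8, 8), 16) % 900;
    field.style.backgroundImage = "url(gradient.png)";   // the big gradient image
    field.style.backgroundPosition = "-" + x + "px -" + y + "px";
  }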
The forked project is also written in a way that makes it easy to add other visualization techniques. You can also find a quick Greasemonkey script that adds this feature to all password fields here.

Easy Installation of RSA Adaptive Auth and SecurID using Google Friend Connect

A few posts ago, I wrote about protecting Gmail using RSA SecurID. This post details how the idea is extended to protect any website using the Google Friend Connect way. The video below shows how an RSA SecurID gadget embedded into a page using Google Friend Connect works. This demo also shows how the RSA SecurID authentication service could be consumed as a service by smaller websites.



The gadget loads a SecurID protected page and a login page hosted as a service. The page opens a new window with the login page where the user types in credentials. Once the login is successful, an authentication token is set in the main website's page that other gadgets can use. Checking for an actual login would be done using a protocol similar to OAuth, where an "isAuthenticated" request is made to the RSA Authentication service to check for authentication.
The demo was also extended for hosted RSA Adaptive Authentication solutions that site owners can easily plug into their websites.

Fetching the numbers from SecurID for a web page

It started out as a joke but ended up invoking quite an interest in the idea. Though I am not permitted to discuss the idea in public due to confidentiality clauses, I thought it would be fun to jot down the way it was implemented in a 2 hour span for the RSA Hack day.
The idea required the random numbers generated by RSA securID for using it elsewhere. Interestingly, the algorithm is protected and there is no direct and simple API to get the numbers (due to obvious security reasons).
This was a smashup challenge, and a hack to demonstrate would be welcome. Since there was no direct way to get the numbers and reverse engineering the token was a lot harder, we decided to pick the numbers from a software token.
All the seeds for a user were installed in a couple of software token instances. When a user logs in and the token code is requested, we load the required token in the software token. The software token is manipulated using macros. In the background, Snagit is set to capture screenshots of the area showing the token every fifteen minutes. The screenshot is served using a servlet on Tomcat that is called every 10 seconds. The servlet also deletes all but the latest image to keep the size of the folder in check. This image is passed to an online OCR service that returns the required numbers.
A long way to get the numbers, but the implementation was fun. Sheer hack!! :)

Updates to Ubiquity Scripts - Parser 2

The Ubiquity Firefox extension upgraded its parser to support a richer set of nouns and i18n. This rendered a couple of my Ubiquity commands unusable. The commands I wrote can be found here and here. I have been able to port the "bookmark to delicious" command to the new version, but the "linkify" command seems to have problems.
Porting the "bookmark to delicious" command was simple; I just had to change the noun definition. Since the command does not really take any arguments, the only noun is the text selected on the page. Thus, only the command name had to be changed to get it working.
Converting the "linkify" command was trickier though. The preview in the newer version seems too slow to be used for any interactivity. Hence, the user cannot really choose the search result that suits the page context. Looks like we would have to remove the use of the preview pane and create a popup instead. This popup would let the user click on a link that would become the hyperlink for the selected text. This new UI would also let the user link words with arbitrary search terms and look through the pages of the search results. Watch this space for the upgrade that I plan to work on during the weekend.

Dynamically provisioning data centers with enVision and VI SDK

One of the most important characteristics of cloud deployments is the ability to scale dynamically when servers are loaded. Some of the metrics used for scaling include CPU, memory and bandwidth utilization. However, in most cases, these metrics are local to a specific system. The dynamic provisioning of additional capacity is also reactive to peak demands. This directly translates to a loss of responsiveness for a short interval of time while the capacity is being allocated.

Enterprise deployments are usually a collection of heterogeneous systems with well-studied patterns of stress propagation. These stress patterns usually progress from one geographic location to another or from one type of server to another (web server to database, etc.). Hence, a provisioning system with a global view would allow proactive provisioning, adding computing capacity to the correct type of servers.

One of the ideas we presented at the RSA sMashup challenge was to demonstrate this dynamic provisioning on a global scale. We picked RSA enVision to collect logs from servers deployed on virtual machines hosted by VMware ESX Server. The triggers that provision more machines are configured as reports and alerts in enVision. The alerts call a batch file that contains VI SDK commands to create and start servers. This file takes care of cloning the machine, bringing it up, etc.
Thus, enVision raises alerts when it notices load on servers, which in turn provisions more servers that are prepared for the load by the time it progresses to them.

Encrypting Data before Storage on Cloud

With the cloud offering almost limitless storage, most data owners end up trusting the cloud provider for confidentiality and integrity of data. There are cases when it would be desirable to encrypt data before it leaves our systems to the cloud. Many enterprise deployments are already equipped with key management solutions and this could be roped in to manage keys used to encrypt data stored on the cloud.
For the sMashup, we hooked up RSA Key Manager and EMC Atmos cloud storage. The result was a transparent API layer over the existing Atmos API. Here is how the code looks for encrypting while uploading and the reverse while downloading data. The files are available here.
The code shows how files could be uploaded and downloaded. The code could also be used as an API to encrypt and decrypt byte streams making it a stand alone API. Since it is built on top of the existing ATMOS api, it becomes easy to rope it into existing projects. Here is the demo that we used for the 90 second presentation.

RSA sMashup


Time : July 7th and 8th, 2009
Location : RSA, Bangalore

Will post the ideas later.

Update to Reddit Bar

Here is a quick update to the reddit bar Greasemonkey script that I had written. Looks like the id of the title has been removed, which caused the Greasemonkey script to stop functioning. A quick fix and it is back to normal.
Instead of relying on the id, we now iterate through all the tags that have the "title" class. This also selects the hyperlink to the article. To find it in the array of "title" class tags, we do a simple match to check if the tag has the document's title as its innerHTML. The title of the window has the form "name of the story : Reddit". If the node is a link and its innerHTML matches the appropriate part of the document title, it is returned as the targetURL, which is used for the iframe. Back to normal, I can now upmod stories right from the comments page. You can check out the script here.
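A sketch of that lookup; the " : Reddit" suffix handling follows the description above:

  // Find the link to the story among elements with the "title" class by
  // matching against the window title ("story name : Reddit").
  function findTargetUrl() {
    var storyTitle = document.title.split(" : ")[0];
    var candidates = document.getElementsByClassName("title");
    for (var i = 0; i < candidates.length; i++) {
      var node = candidates[i];
      if (node.tagName === "A" && node.innerHTML === storyTitle) {
        return node.href;    // used as the target of the iframe
      }
    }
    return null;
  }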

TwitteyBot updates

I had written a Google App Engine based application to create and manage bots that feed data into Twitter. The application is similar to what twitterfeed does, with some additions. The application is now up and running at http://twitteybot.appspot.com where you can register to activate feeds.
The data structure used is simple, but I am afraid it may not stand up to the scalability requirements. The code is hosted at http://code.google.com/p/twitteybot.
The persistence object is structured as User > List of Twitter Accounts > List of Feeds > List of Status Updates. Being a unidirectional owned relationship, when the status messages are fetched from the feed, they cannot be directly inserted into the Status Updates object without a reference to the user. Another problem with the status-fetch cron job is that it currently requires a user name. This also has to be changed before the bot is fully functional. The Twitter accounts also require usernames/passwords. These can be changed to support OAuth, but that's something for the future. The user interface is also plain HTML based; it has to be changed to make AJAX updates to the database. That should make the product more or less complete.
On the content side, there still needs to be work for getting content for the bots. I was planning on bots for the Bhagwad Gita and Thirukurals. The Bhagwad Gita is on a webpage, not parsable by Yahoo mail. I would have to curl (download) it all to a text file and then upload the file. Watch this space for updates, and please do drop a comment if you would like to install the bot in your App Engine account.

Using Bit.ly to track Gmail logins.

I had written a Gmail gadget that lets you track your usage of gmail. The tracking data was stored inside your own Google Calendar. The calendar was also used to send SMS messages if anomalies were detected in logging into the account.
The only thing the gadget seemed to lack was visualization. I wanted to use Google Visualization some day, but could never find enough time to get it done. That is when I decided to piggyback on the visualization capabilities that bit.ly provides. I had written about the way to track rogue visitors. The same scheme can be applied to this gadget also.
An extra link was added at the bottom of the gadget where the user can specify the bit.ly URL. This is saved as a user preference, and an invisible image with this link is appended every time the gadget is loaded. Since a call is made to the bit.ly URL, the access is tracked. The code is now uploaded to http://gmailids.appspot.com. The source code is also available here. To enable this change, you may have to add the gadget again (which may result in losing the existing data about how long you have spent on Gmail), or wait until Gmail picks up the change.

Twitter Bot on Google App Engine

Over the weekend, I have started working on another application to be hosted on the Google App Engine. This application is a bot that reads from a set of RSS feeds and posts the updates to a specified twitter account.
On the face of it, this is a simple project, but there are some interesting intricacies that I wanted to put down in this post. The idea is to create a management dashboard that does the following
  • Allow users to specify which twitter accounts to post to
  • Twitter status update interval
  • Feeds that will act as a source for status
  • Interval for fetching feeds
  • Multi tenancy, supporting multiple users for bot
The Google App Engine database has to be used to save the updates, to handle the difference in frequency between the feed and Twitter updates. Watch this space for updates.

Inside the dragon fish login demo

Dragon Interactive Labs have excellent demos on how JavaScript can be stretched to create some awesome effects. This post details how the dragon fish demo login screen is done using pure JavaScript. The colors around the login box and the glow are done using background images and changing their position. The first layer is this image, set as a background image whose left position changes all the while.
On top of this div are the masks for the four corners and the four sides. These have semi-transparent PNGs that are responsible for the glow effect. There is also a mask in the middle that masks the majority of the background.
It is above this layer that the regular fieldset containing the username and password fields sits. You can see this effect if you move the third element with the "map" class name to an absolute position, 200px to the left. The JavaScript constantly moves the huge image to the left, creating the effect.

Weave Identity : The auto login feature

A few days ago, I had written about the interesting similarities between Microsoft Cardspace and the way Weave works. This post is about the other interesting feature - the auto login. The idea is to save usernames and passwords on the Weave server, retrieve them when we are on a login page, and submit the form.
If you unzip the xpi file, you can look for the LoginManager.js file. This is the file that is responsible for saving user credentials, retrieving them when required and submitting the form. Most functions are self explanatory. An observer is called whenever a form is submitted. The credentials are then picked up and saved to the store.
The _fillForm function looks for matching logins saved for the login form and fills in the details. There is some complex logic that determines whether a password is correct or not, but it can be ignored for now. It does something like the form.submit() call that I was searching for but unable to find. I was also wondering how the extension handles password submission for sites like meebo.com that hash the password before sending it across. Watch this space for updates on how these login form oddities have been handled.

Reddit Bar :: adjusting the extra space

A quick update to the reddit bar greasemonkey script. Looks like the reddit page updated its code to add some extra style to the comments page. This added an extra space under the reddit bar that was inserted using a greasemonkey script I had written. I just got some time to fix it and you can find the final code here.
The addition was a change to the style of the footer: the div with the class "footer-parent" now has padding-top set to 40px.
A single line to set this to 0px did the trick and the extra space is now gone.
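The fix itself is a one-liner:

  document.getElementsByClassName("footer-parent")[0].style.paddingTop = "0px";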

Weave Identity : in contrast to cardspace

There have been multiple announcements about Mozilla Weave adding an identity angle to its offering. A video detailing the auto-login capability is available here. Weave enables automatic login on OpenID-enabled sites behind the scenes. It also submits the login forms automatically when user credentials are remembered. As explained in the video, the "Sign in with Weave" button appears when login with OpenID is permitted. Clicking the button automatically performs the redirections and logs the user into the site.
This looks exactly like the Windows Cardspace initiative. The similarities are interesting. Here is the workflow laid out side by side.

  1. Cardspace: User loads the login page with Cardspace enabled.
     Weave Identity: User loads a login page that accepts OpenID credentials.
  2. Cardspace: User selects the "Login with Cardspace" button on the page.
     Weave Identity: User selects "Sign in with Weave", or the button on the address bar.
  3. Cardspace: The Cardspace UI is displayed and the user selects the card that best represents the identity for the site.
     Weave Identity: There is just one Weave identity for now; it gets automatically selected.
  4. Cardspace: Cardspace makes a request to the identity provider site to get details.
     Weave Identity: A browser redirect takes the user to the OpenID identity provider page (services.mozilla.com).
  5. Cardspace: If the identity provider requires credentials, a dialog box shows up.
     Weave Identity: Since the user is already logged in using Weave, no credentials are requested.
  6. Cardspace: Credentials are sent to the identity provider and, if authentication succeeds, a positive reply is received.
     Weave Identity: The OpenID provider redirects to the original page with the reply.
  7. Cardspace: User is logged in successfully.
     Weave Identity: User is logged in successfully.

The second part of Weave is the way the login manager works. The way it saves usernames/passwords and auto-submits forms is interesting.
More about it, and analysis in the next post.

Facebook Konami Lens Flare Effect

The Konami cheat code finally made its way into Facebook. There were a lot of posts that spoke about the up-up-down-down-left-right-left-right-B-A-Enter code and the lens flare effect on Facebook. I did a little probing, and here is how it is implemented and how it can be reused on your own site.
First, there is the handler that catches the secret key. You can find it in this file. Just search for
[38,38,40,40,37,39,37,39,66,65,13], which represents the key codes for the secret key combination. The function is called onloadRegister, which typically registers a key handler on the window. The function effectively loads this response, which in turn loads the file responsible for the lens flare JavaScript.
The lens flare is initialized using the following code after the required javascript file is loaded.

var lf = new LensFlare("http:\/\/static.ak.fbcdn.net\/images\/lensflare\/*?8:162521");LensFlare.flareClick(lf).flareScroll(lf).flareKeyPress(lf);
This is done after some variables are set in the win.Env object, the use of which is still a mystery to me. To execute this in our page, we need to initialize some objects, defined in this file. To get the lens flare running, this file also has to be included. Here is the consolidated code that you need to use.

Paste this in the browser URL first,


and then, after some time, to activate the flares, paste this.


You can try the code on this very blog window.
To include this in your page, you have to include the following script in your page.



The interesting part in this is the animation of the images to create the lens flare, another example of fine JavaScript wizardry!!

Reddit Bar :: Self links fixed

A few posts ago, I had written about the Greasemonkey reddit bar. Swaroop pointed out that the reddit bar was not working for self links. It does not make sense to show the reddit bar on such pages.
I have changed the code to reflect this and you can find it here. The change was trivial: just find the target URL that would open in the iframe, and if it equals the document's URL, the bar does not load, letting the page stay as it is.
Swaroop also pointed out that the reddit bar shows up twice, and I am taking a look at it. If you have seen errors, please do send me the URL to help me fix them.

Bridging the Gap between the internal and external cloud - RSA Access Manager to protect resources on Google App Engine

Here at RSA, one of the areas of constant probing is the idea of using existing RSA applications to bridge the gap between internal infrastructure of an enterprise and its extension on an external cloud. With a long weekend at hand, I worked on this hobby project. This post details ways to seamlessly protect the applications that you have moved to an external cloud with an existing RSA Access Manager installation.
The only interesting fact is that Google App engine is Platform-as-a-service (as opposed to Amazon EC2 which is Infrastructure-as-a-service).
Here is a deployment diagram with the various components involved. As shown in the diagram, the identities of the users rest in an LDAP or Active Directory inside the enterprise domain, usually protected by a firewall. Apps 1 and 2 are deployed inside the domain of the enterprise while App 3 is deployed on Google App Engine (a different domain). The user may access the resource either from inside the enterprise, or from outside, say from a kiosk.

Authentication :
In all cases, the user should be authenticated using credentials of the enterprise. Google App Engine credentials may be used for single sign on but it is not mandatory.

Authorization :
The system administrator should be able to define access management rules for the user based on the user's identity inside the enterprise.

Here is how RSA Access Manager can be configured to access the authorization-rules server to determine access to users. Firstly, we would require an Access Manager agent at the Google App Engine server. App Engine Java version provides Server filters that intercept requests before they access the actual resource. Here is how the work flow proceeds.

  1. User accesses a resource on App Engine. He is not logged into Google or into the enterprise.
  2. The filter intercepts the request and does a URLFetch to the Access Manager Authorization-rules engine, deployed inside the firewall.
  3. This communication is secured by using Google Secure Data Connector. This could also have been secured using mutual SSL if URL Fetch would allow it.
  4. The server sees that there is no cookie set in the request and replies to the agent that authentication needs to be performed.
  5. The agent sends back a redirect response that throws the user to the authentication page hosted inside the enterprise domain.
  6. The user is authenticated at this page. If the user is inside the enterprise, he could be authenticated using NTLM, etc.
  7. If credentials are correct, a cookie for the enterprise domain is set.
  8. The user is then redirected to the resource page, carrying a message from the server to set a cookie for the Google domain.
  9. The agent sets the cookie and again redirects to the resource page.
  10. Since the resource is still protected by the filter, the filter kicks in again and does a URL Fetch to the server to see if the user is allowed.
  11. The server sees the cookie that is sent in this request and decides on if this user is allowed access or not.
  12. The agent sees the reply from the server and redirects or blocks the user, depending on the response.
This case becomes a little complicated if resources are protected using Google credentials, as specified in the app config file. This is the case where single sign-on and federation are required; details about this in a later post.

Hence, using this simple filter, you can protect your applications that have moved to the external cloud just like resources are managed inside the enterprise. Adequate measures to secure the communication between the agent in App engine and the Authorization server ensure that this deployment is secure like it was when inside the firewall.

Flash Resizer : Toggle it on and off

This is a continuation in the Flash Resizer Bookmarklet series that I had written about some time ago. This post details the capability to toggle the bookmarklet on and off. A couple of people had written to me complaining that the resize capability prevented drag and drop functionality inside the flash. All drag and drop events seem to be picked up by the event handlers assigned by the bookmarklet. Hence, the toggle functionality could be used to remove the event handlers.
The trickiest part of the implementation is maintaining state between two invocations of the bookmarklet. Since every invocation of the bookmarklet re-creates its objects, all state information has to be saved in the global window object.
When the resize handlers are assigned, they are also pushed to a global array. On subsequent invocations, the presence (undefined or not) of this array can be used to toggle between the active and suspended states. If the resizer is to be disabled, the destroy method of YUI Resize is called and the elements are popped out of the array. Reactivating would require adding handlers to the "move-handler" class elements.
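In essence, the toggle looks something like this; the global name is arbitrary and the creation step is elided:

  // Bookmarklet sketch: the global array doubles as the "am I active?" flag.
  if (typeof window.__flashResizerHandles === "undefined") {
    window.__flashResizerHandles = [];
    // ... create the YUI Resize instances here and push each one onto the array ...
  } else {
    // Second click: tear everything down and go back to the inactive state.
    while (window.__flashResizerHandles.length) {
      window.__flashResizerHandles.pop().destroy();   // YUI Resize destroy()
    }
    window.__flashResizerHandles = undefined;
  }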
You can drag and drop "Flash Resizer" to your bookmarks toolbar to check out the latest version. Click it once on any page that has flash on it to be able to resize the flash. Thereafter, you can click it to toggle activation and deactivation.

Launchy plugin for WCD

I have found the "Whereever Change Directory"(WCD) tool very useful on the command line. Its a directory changing tool that changes to the directory without requiring to type the whole path. Porting it to Launchy as a plugin would be a great utility.
The windows version of the WCD tool uses a batch file to perform the directory change operations. A batch file by the name wcdgo.bat it created at a location indicated by the wcd.bat file. To use it with Launchy, we would need to append a "start ." statement to the file. Here is the modified version called wcdLaunchy.bat, forked from wcd.bat. This is the file that has to be fed into the Runner plugin with an appropriate keyword in Launchy.

Here is what the batch file looks like.

Flash Resizer : Technical Details

I had written about a bookmarklet that allowed Flash Content on a web page to be resized. This post details the technical details of the code.



To start with, we iterate through all the flash elements - the Object tag for IE and Embed for everything else. For flash elements that are opened with wmode="window", we redraw the element with wmode="opaque". This is done to allow other HTML elements to be drawn on top of the element.
Transparent divs are drawn over the flash elements. Click handlers are added to the divs that recreate the flash elements inside a "position: absolute" div. This div can be resized and moved (with the YUI Resize component). The flash elements, with 100% width and height, resize with the div.
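In outline, the overlay step looks roughly like this; the class name, sizing and YUI wiring are simplified assumptions rather than the actual bookmarklet code:

  // Cover each flash element with a transparent div; clicking the div moves the
  // flash into an absolutely positioned container that YUI can resize and move.
  var flashElements = document.getElementsByTagName("embed");    // "object" on IE
  for (var i = 0; i < flashElements.length; i++) {
    (function (flash) {
      var overlay = document.createElement("div");
      overlay.className = "flash-resizer-overlay";                // positioned over the flash
      overlay.onclick = function () {
        var container = document.createElement("div");
        container.style.position = "absolute";
        flash.width = "100%";                                     // so the movie follows the container
        flash.height = "100%";
        container.appendChild(flash);
        document.body.appendChild(container);
        // new YAHOO.util.Resize(container, { handles: "all" }) makes it resizable.
      };
      document.body.appendChild(overlay);
    })(flashElements[i]);
  }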
There are only a few caveats.
  • Shockwave Flash still appears on top of all HTML elements
  • Certain flash files do not scale and are fixed. Hence, resizing the flash does not scale the contents, it just expands the element. I am working on ways to fix this.

Flash Resizer

I have often noticed that Flash content on many sites would be better if it could be resized to larger dimensions. Sites like YouTube and Vimeo do allow full-screen videos, but that does not really work on multiple monitors where you want to work in another application. Games would also look a lot better on sites like onemorelevel and miniclip if I could simply maximize the flash and hide the annoying advertisements.
I wrote a simple bookmarklet over the past couple of days that does just that. You can select any flash content, move it and resize it by dragging the corners. The code is available here. All you have to do is add this FLASH RESIZER link to your bookmarks or drag and drop it to the bookmarks toolbar.
Here is how it works. A post about how the code works would follow.

Reddit Bar - Greasemonkey Script

Before Digg released the DiggBar a few days ago, many users may never have bothered to visit the comments page or upmod a story. It was just too much work to upmod a story, especially when the Digg Redirect script took us directly from Google Reader to the story. With the DiggBar, it becomes a lot easier for users to participate.
I think that the DiggBar is a great idea and wanted to replicate it for reddit too. Here is a Greasemonkey script that you can install to see a similar bar for reddit. Users can toggle between the page and the comments by clicking the comments link. I find this particularly useful when hitting the comments page from reddit RSS feeds.
The code is simple: it just removes the header, the footer and the right div that displays statistics. It then picks up the details of the story from a div called "sitetable" and places it in the header. The last step is to allow a toggle between the iframes that display the target page content and the comments.


Tracking those stalkers on Orkut

A few days ago, I had written about a service that would help track URLs posted on Twitter. I had barely started working on it when a friend told me about tr.im. It is a URL tracking service with statistics.
Apparently, this can also be used for the other use cases - tracking on Orkut and forums. The redirect can be a simple transparent image placed in the forums or Orkut scraps. So, the next time you get a scrap from an annoying stalker, just add a transparent png using tr.im to the scrap that you send. You can also drop the image into all emails and forum posts that you make. This would give you an idea on how far your mail travels or how many people view your posts.
Though the service was written for twitter, you can use it for a variety of services.

Greasemonkey on the server

A search for the term "Greasemonkey on the Server" leads you to a project that uses a servlet filter to insert scripts, and it seems to be quite old (and no longer under development). An alternative suggested in the forums, called SiteMesh, seems to have met the same fate. Even if we were to use them now, they work only with code that you own.
The framework I wanted should have the following
  1. Proxy any webpage, letting me insert my own scripts into the page
  2. Should not require me to configure the server as a proxy. Instead, I should be able to hit a specific url to view the page
  3. Images, CSS, and even AJAX should work like it would work on the original page.
Some application that could leverage such a framework would be
  • Dynamically marking pages for automatic scraping, like dapper.net
  • Adding more data to a page and saving it as our version
  • Tracking user mouse movements, studying user behaviour
  • Walking users through sites
  • Collaborative browsing
The list goes on. There are a lot of sites that try to achieve the effect, but without 100% success. Here are some methods we can use to achieve near-perfect pages, even though they are fetched by our servers.

  • For URLs (img, script, src), instead of rewriting the fetched HTML, we may as well have a handler at the root of our domain that redirects to the actual source. In this case, if the URL is absolute, it works just fine; in case of relative URLs, they hit our server and are redirected
  • Insert a script at the top of the page that substitutes XMLHttpRequest.open to call our proxy URL (a sketch follows this list).
  • Use our script to rewrite a href attributes to have control over the target pages.
  • Use our script to rewrite the targets of form submits.
  • Send cookies set in the browser to our proxy for them to be relayed by the proxy to the actual server.
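For the AJAX case, the injected script could look roughly like this (the proxy URL format is an assumption):

  // Injected at the top of the proxied page: route every AJAX call through our proxy.
  var originalOpen = XMLHttpRequest.prototype.open;
  XMLHttpRequest.prototype.open = function (method, url, async, user, password) {
    var proxied = "http://proxy.example.com/fetch?url=" + encodeURIComponent(url);
    return originalOpen.call(this, method, proxied, async, user, password);
  };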
The idea is still rough, and some of the elements that are a problem include (but are not limited to) Flash, top.location and window.location changes, scripts in the page tampering with the injected script, etc. A container like Caja (or Cajita) could come in handy to tinker with elements that have to be changed on the server side.
The idea is crude, but as I refine it, I will be posting updates.

Quick Analytics

As mentioned in an earlier post, sneak-o-scope is dead due to opensocial templates. However, the work done with sneak-o-scope could be used to build a simple analytics system. The system could be used for
  • Know how many of your twitter followers actually click a link you post
  • Track down that stalker who sends me disgusting scraps on Orkut
  • Know how many times my thread has been viewed in a forum
  • Tracking if a friend really clicked on a page sent to him over IM
  • Track how far an email travels, who views it, etc.
Hence, the requirements for the system should be
  • Require no registration, should be as simple as TinyUrl
  • Give a URL that redirects to the original page, redirects to an image, or opens a 1x1 transparent PNG
  • Show easy to understand analytics
  • Send emails when someone is tracked
  • Allow to have the analytics private
I am still working on the UI to keep it as simple as possible, so if you have suggestions, please do write in. I would also appreciate a name for the project.

How Tynt works - Technical details

The idea of Tynt is simple and effective - whenever stuff is copied from your site, track it and insert a small link-back to your website. It may not be 100% effective, but the idea works for people who don't play around with the content that is copied. This post describes how they track what is copied, along with some enhancements that I think would be useful.
Preventing text from being copied from a website has been around for ages. The onCopy event gives us a handler just before content is copied, allowing the actual copied text to be manipulated. The Tynt site requires us to insert a script that registers handlers. Here is the step-by-step explanation of what happens once their script loads on our website.
  1. Register handlers for onCopy, onDrag, etc on the window Object
  2. Get a unique URL that will be used as a tracker
  3. When any of the registered event occurs
    1. Send an event to the server
    2. On firefox, create a new node with the data that has to be displayed with the content that is copied. Set selection to existing node and this new node.
    3. On IE, add extra text to the current Selection
    4. Cancel the propagation of the current event.
This works fine for most websites as not many use the oncopy event. However, I find the text that is appended to the existing selection a little too obtrusive. Instead of adding such huge content, it would be better to include a simple image, with all the links on the image. The image can still be removed, but in my opinion, the chances that a small image would be ignored are higher than for a huge block of text. The image can also serve as a pointer to the places where the copied content travels. There should also be some attribution text added when the client into which the content is pasted does not support rich text. Something simple in braces should do the job.
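For comparison, a bare-bones version of the oncopy trick looks like this; it is one common way to do it, not necessarily Tynt's exact code:

  // Append an attribution to the selection just before the copy happens,
  // then restore the original selection so the page is not altered.
  document.addEventListener("copy", function () {
    var selection = window.getSelection();
    if (selection.rangeCount === 0) return;
    var range = selection.getRangeAt(0);

    // Build an off-screen node holding the selection plus the attribution text.
    var holder = document.createElement("div");
    holder.appendChild(range.cloneContents());
    holder.appendChild(document.createTextNode(" (Read more at " + location.href + ")"));
    holder.style.position = "absolute";
    holder.style.left = "-9999px";
    document.body.appendChild(holder);

    // Select the holder so that it is what lands on the clipboard, then clean up.
    selection.selectAllChildren(holder);
    window.setTimeout(function () {
      document.body.removeChild(holder);
      selection.removeAllRanges();
      selection.addRange(range);
    }, 0);
  }, false);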
As for how the analysis was done, all I had to do was use Fiddler to load a formatted version of Tracer.js?user= to understand the code. Then I had to check out the event handlers, which led me to handleTracing(). The function has a couple of inline functions, and one of them, called H(), is responsible for replacing the text.
To summarize, the idea is nice, but it is the JavaScript implementation that stole my attention. :)

How Opensocial templates on Orkut killed Sneak-O-Scope

Hi,

A few months ago, I had posted a hack on Orkut OpenSocial that allows directed phishing attacks on Orkut accounts. The Orkut team has now acted on it and locked down the profile view: the current specification mandates the use of templates for the profile view. Templates with data pipelining are fine for displaying information but completely disable all forms of interaction.
This is a problem for the Sneak-O-Scope application, as it relies on being able to make a couple of JavaScript calls to record visit time, IP address, etc. With templates this is no longer possible. Without JavaScript, there is no way to record the time when the user leaves the page. We can record the enter time using requests [os:HTTPRequest], but that still would not give us browser details like the IP address, user agent, etc., because these requests are proxied through the Google URL Fetch bot. Images added to the page are likewise replaced by proxied versions.
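For context, this is roughly the kind of gadget JavaScript the application depended on, and exactly what a template-only profile view takes away. The /track path is made up for illustration; the gadgets.* calls are the standard OpenSocial container API, which is no longer reachable from the profile view.

    // Report a visit event back to the Sneak-O-Scope backend.
    // Note: makeRequest is proxied through Google's fetch bot, which is why
    // the visitor's real IP address and user agent never reach us anyway.
    function recordVisit(eventName) {
      var params = {};
      params[gadgets.io.RequestParameters.METHOD] = gadgets.io.MethodType.POST;
      params[gadgets.io.RequestParameters.POST_DATA] = gadgets.io.encodeValues({
        event: eventName,
        time: String(new Date().getTime())
      });
      gadgets.io.makeRequest(
          'http://sneak-o-scope.appspot.com/track',  // hypothetical endpoint
          function (response) { /* nothing to do on success */ },
          params);
    }

    // Enter time needs code running on load; exit time needs an unload handler.
    // Neither is possible without JavaScript on the profile page.
    gadgets.util.registerOnLoadHandler(function () { recordVisit('enter'); });
    window.onunload = function () { recordVisit('exit'); };
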
The original problem was cross-site scripting attacks on applications. The template approach seems too limiting; a better strategy would be to use Caja to sanitize applications more thoroughly. Since each application runs in its own global object, the "top" JavaScript object can be cajoled into disallowing the opening of frames. This would prevent the attack I published, but the other possible attacks would still have to be looked at.
So, as of now, Sneak-O-Scope is suspended. It can start working again only when JavaScript is allowed on the profile page.

YAHOO Open Mail Application : Email Header

Hi,

A few days ago, I had written an article about the Yahoo Open Mail application that performed redaction of emails. This is a post about another such application that may prove useful.
There are a lot of websites like this, this and this that require us to paste the email header into a text box and then display the information in a much more readable format. Having a tool like this inside YAHOO Open Mail would be a lot simpler.
Technically, the way to create it is simple. We have to attach a drag-drop listener that activates when an email is dropped onto the application. The handler would require "full" details and hence would also receive the header in the "part" property of the passed JSON object. Once that is done, it is a matter of parsing the header and displaying the information, either in a new tab or in a pop-up dialog box. An addition would be to use free IP-to-geolocation tools to determine the location of the hop servers and display them on a map. A further visualization could animate the mail's path to give the user an idea of the delays involved.
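The parsing step itself is plain string work on the "Received:" lines. Here is a rough sketch, assuming the raw header has already reached us as a string (how exactly it arrives depends on the Open Mail API and is not shown here).

    // Extract the hop servers and timestamps from a raw email header.
    // Only "Received: from ..." hops are handled in this sketch.
    function parseReceivedHops(rawHeader) {
      // Unfold header lines: continuation lines start with whitespace.
      var unfolded = rawHeader.replace(/\r?\n[ \t]+/g, ' ');
      var lines = unfolded.split(/\r?\n/);
      var hops = [];
      for (var i = 0; i < lines.length; i++) {
        // The date of each hop follows the last semicolon on the line.
        var match = /^Received:\s+from\s+(\S+).*;\s*(.+)$/.exec(lines[i]);
        if (match) {
          hops.push({ server: match[1], time: new Date(match[2]) });
        }
      }
      // Received headers are prepended, so reverse to get sender-to-recipient order.
      return hops.reverse();
    }

The hop list can then be rendered as a table, or fed to a geolocation service to plot the route on a map.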
I would love to get this out as quickly as possible; unfortunately, I seem to have lost my bouncer ID for the mail applications. If you know of any, please do let me know.

Search with your Google Custom Search Engine on the side

Hi,

I had earlier written about a custom Google search engine I had created that helped me look through my delicious bookmarks. The search engine was great, except that I lost the other features Google offers - things like definitions, books, etc. It looks like the default Google search cannot be done away with, which is why I wrote this small greasemonkey script that shows the results from my search engine side by side with the Google results.
The script itself is simple: it gets the search query and page from the location, sets them in the Google CSE URL and makes an AJAX call. The results are put into a div that is placed next to the results div. All decorations except the search results themselves are removed. That done, the sponsored results table is also blanked out to use that space for the custom search results.
If you want to add your own custom search engine, simply replace the first variable in the script with your value - this is the cx parameter passed when you search on your custom search engine.
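For reference, here is a stripped-down sketch of the approach. The cx value is a placeholder, the CSE URL and the id of Google's results container are whatever the pages happen to use at the time, and the real script strips the fetched page down to just the result entries.

    // ==UserScript==
    // @name     CSE results next to Google results (sketch)
    // @include  http://www.google.com/search*
    // ==/UserScript==

    // Replace this with the cx parameter of your own custom search engine.
    var CSE_CX = '012345678901234567890:abcdefghijk';

    // Pick up the query and result offset from the current results page URL.
    var params = new URLSearchParams(location.search);
    var query = params.get('q') || '';
    var start = params.get('start') || '0';

    // Fetch the custom search engine's results for the same query.
    GM_xmlhttpRequest({
      method: 'GET',
      url: 'http://www.google.com/cse?cx=' + encodeURIComponent(CSE_CX) +
           '&q=' + encodeURIComponent(query) + '&start=' + start,
      onload: function (response) {
        // Drop the fetched markup into a new column beside Google's results.
        var side = document.createElement('div');
        side.style.cssText = 'float: right; width: 35%;';
        side.innerHTML = response.responseText;
        var res = document.getElementById('res'); // Google's results container, subject to change
        if (res && res.parentNode) {
          res.parentNode.insertBefore(side, res);
        }
      }
    });
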

While I am at it, I am also planning to convert this into a Ubiquity command.

Gmail Burglar Alarm - Back and Running

Hi,

You may have noticed that the Gmail Burglar Alarm may have stopped working some days ago. To get the application working again, simply delete the calendars corresponding to the application.
The new calendars will be created once the application is activated again.
This was because of some issues with the calendar and the token that is used to access the calendar data. The issue seemed to spring up because of the way non-primary calendars were handled at Google Calendar. Deleting the calendar resets the token and this new token is used to add events to Google Calendar.
On a side note, I am also working on converting the plain tabular data into better visualizations using the Google Visualization API.

YAHOO Mail : Data Leakage Prevention

YAHOO mail has launched the ability to add applications to the web interface. This post is about an application that was presented at the YAHOO Hackday, Bangalore. The original hack is detailed in Babu Srithar's blog.



As shown in the video, a mail can be dragged onto the application icon to redact information. The prototype is crude in the sense that it uses encryption to achieve redaction, and it requires the user to identify the sensitive information to be redacted. A production-ready implementation could use industry-standard data classification toolkits like Tablus to identify that information automatically. Using roles assigned to users, the redaction server could also ensure that the information displayed to each user depends on the roles assigned to them.
There are still some usability issues with having to drag and drop the mail onto the application, but I think YAHOO will overcome this and grant applications more capabilities.

Sneak-O-Scope : Release 4.0

Hi,

We just pushed out a new release of Sneak-O-Scope with a cleaner database. Though there were not many cosmetic changes to the application, the code base is a lot more manageable now.

The first change was the addition of the App Engine console to the backend. The console is a simple Python shell that executes commands on the server. This was very useful for testing the database and the limitations imposed by Google App Engine.

The second most important change was that the enter and exit time columns in the database were converted to the Python datetime format. They were TextProperty fields earlier, which led to invalid dates and times.

Another change was the introduction of a table to store unknown visits separately. This was done because we could not separate known and unknown visits in a single query: since we sort on enterTime in descending order, we cannot also have an inequality filter on the visitor field. Hence, we separated them into two tables and query them separately. This also lets us pick 100 known and 100 unknown visitors, populating the list in the UI with at least some known visitors instead of all visitors being unknown.

We also used the Preferences table to aggregate data. The table now has a Text column that stores the JSON for total visitors, known visitors, etc. The field will later be used to store browser and friend statistics as well.

We also included the simplejson library to serialize and parse the JSON in requests and responses. This is better than hand-parsing strings to pull the data out.

There are also a couple of migration scripts that will port data from the database at sneak-o-scope.appspot.com to orkut-scope.appspot.com. This partitions the database by container - users from Orkut, MySpace, etc.

The IP address has now been put to use with a YAHOO pipe that displays the location, ISP, etc. in an iframe. Details about this in a later post.

Last but not least, we also included an HTML page to indicate when the application is not installed properly and has to be moved above other applications. There is also inline text next to unknown visitors about Orkut's privacy policy, which does not let us display details of visitors who do not have the application installed.

Moving Sneak-O-Scope above other applications

There were a lot of queries about the installation of Sneak-O-Scope and it not working properly. Almost all of them were because the application was installed but not expanded by default. Users have to move the application above other applications on their profile page. Here is a video of how it is done.


The accident

I had an accident yesterday and am totally immobile now. With just a laptop to keep me company, I created this graphic reproduction of how it happened.



This was done in SWiSH.

Searching through my delicious bookmarks

Hi,

I had earlier written about a ubiquity command that allows bookmarking pages quickly into delicious. It had help from YAHOO pipes to generate tags automatically and submit them to delicious using their API.
The only part missing was a neat way to search those bookmarks. Since the tags were auto-generated, delicious search did not do a good job, as it does not look at the page content. Also, I wanted to rely on Google's search index while restricting it to only these bookmarked pages. Google Custom Search was the answer, and here is my custom search engine that does the job.
Creating the search engine was simple: all I had to do was include my delicious feed in the makeAnnotations URL. This generates the annotations file automatically, updating it at intervals, so we do not have to go through the drudgery of creating annotation files, including sites, etc. by hand.
The search seems to be useful and I created the A9 search XML and added it to my search bar using the Mycroft engine builder.
If you have interesting delicious feeds, do let me know so that I can add your sites as well, above the default Google results. The only thing the custom search page lacks is the Images and Videos tabs, for which I am writing a greasemonkey script.