
My experience at the Industrial Conference on Data Mining 2013

The Industrial Conference on Data Mining (ICDM) is a prestigious conference (rated as one of the top 25 data mining conferences in the world) where top researchers present theoretical and application-oriented topics in data mining. This edition of the conference was held from July 16 to 21 in Newark, New Jersey. Our team saw it as a unique opportunity to present the work we had been doing in data mining, since the conference brings together academia (Harvard, Oxford, MIT) and industry labs (Yahoo Labs, HP Labs and CERN, to name a few that had published their work at this conference earlier).

Samit Paul from the Data Services team and I, from the txtWeb team, submitted a research paper on our IntelliTips work to the conference, and the organizers accepted it as an Industry Paper! We were thrilled with this outcome and prepared the posters and material needed for the conference.

The conference was declared open by Prof. Petra Perner, Chair of the ICDM and a renowned researcher from the University of Leipzig. The first session was an invited talk by Prof. Alfred Inselberg, who presented a revolutionary method for representing points in n-dimensional space by means of Parallel Coordinates.

Our presentation was slated for the evening session on July 17, and it went like a charm. The talk was very well received, and there were a lot of questions about how the algorithms scale, since this is live for today’s txtWeb user base, which is easily more than 11 million! Many researchers were impressed with our vision to “Enable anyone with any mobile phone to create, discover and consume bite-sized services through a single global publishing platform”.

There were also many questions about our future work, including whether this service would become available entirely over voice, eliminating the need for the end user to type even a single letter of text and making the service even more accessible. People were also interested in our business model and how we plan to take this to the next level.

Stop working (so hard)

If any of you are working weekends or pulling all-nighters thinking ‘If I don’t do this, the other guy/company will crush me’, this is a very good read: https://medium.com/i-m-h-o/ef4772e3c628

In this article, Kyle Bragger talks about why vacations and clear working hours are important. As he points out, they give you more time to enjoy your work and to spend with family, which in turn can lead to the most productive days of your life. I have bought into this idea and think it’s worth trying out :-)

How to open source your apps using GitHub

I am going to list a few steps that will help you create a GitHub repo and perform basic tasks with it:

Git is an open source version control system, and “it’s responsible for all the stuff that’s GitHub related and happens locally on your computer”.

  1. Download and install the latest version of Git
  2. Set up Git on your machine
  3. Set your username and email as follows, using a terminal:
Set username (this is the default username used when you commit to Git):
$ git config --global user.name "your name"
Set email (this is the default email used when you commit to Git):
$ git config --global user.email "me@mydomain.com"
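To confirm that the settings took effect, you can read them back. The snippet below is only a minimal sketch: it points HOME at a throwaway directory (an assumption made here so that trying it out does not overwrite your real settings), and the name and email are the same placeholders used above.

```shell
# Assumption: use a throwaway HOME so this sketch leaves your real
# ~/.gitconfig untouched. Drop these two lines to change your real settings.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

git config --global user.name "your name"
git config --global user.email "me@mydomain.com"

# Read the values back to confirm they were stored.
configured_name="$(git config --global --get user.name)"
configured_email="$(git config --global --get user.email)"
echo "$configured_name <$configured_email>"
```

If either value comes back empty, the most likely cause is a typo in the option name, for example a single dash instead of the double dash in --global.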

Oh yes, you can skip all of the above steps and get the native GitHub app instead.

That’s it! You have set up Git and GitHub! Next: how to create a repo?
Okay, first:  what is a GitHub repository?
A GitHub repo is like a store where all the data you commit will be stored.

4. In the user bar at the top right of your GitHub page, click the “Create a new repo” button.

5. Select the account you want to create the repo on.

6. Enter a name for the repository, and click on “Create repository”.

Great! Now you know how to create a new repo. Let’s now learn how to push changes to your repo:

7. cd into your project directory.

8. Type the following command to initialize a git repository on your local machine:

$ git init

9. To see what state your project is currently in, run:
$ git status

10. Create a new file in the directory, say test.txt, and then run the git status command again. You will see that Git reports “untracked files present”.

11. You need to tell Git to start tracking changes made to test.txt by adding it to the staging area:

$ git add test.txt

12. Now the file is in the staging area, but not yet in the Git repo. To store the changes we have made, we run the commit command with a message describing what was changed:

$ git commit -m "Added a test file"

13. To view a log of all commits:

$ git log

14. So far, all the commits exist only in the local repository. How do we push them to the server, so that others can see and download them?

Assume the repository is named repo-test. First, register it as a remote named origin:

$ git remote add origin https://github.com/username/repo-test.git

and finally push your changes to the server:

$ git push -u origin master
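Steps 7 to 14 can be tried end to end without touching GitHub. The following is a sketch under stated assumptions, not the only way to do it: a local bare repository (the hypothetical path in remote_dir) stands in for https://github.com/username/repo-test.git, the identity is set per repository so the commit works on a fresh machine, and the branch name is detected rather than assumed to be master, since newer Git installations may use a different default.

```shell
set -e

# Throwaway directories; the bare repo is a local stand-in for GitHub (assumption).
work_dir="$(mktemp -d)"
remote_dir="$(mktemp -d)/repo-test.git"
git init --quiet --bare "$remote_dir"

cd "$work_dir"
git init --quiet
# Per-repository identity so the commit succeeds even without global config.
git config user.name "your name"
git config user.email "me@mydomain.com"

echo "hello" > test.txt
status_untracked="$(git status --porcelain)"   # "?? test.txt": untracked

git add test.txt
status_staged="$(git status --porcelain)"      # "A  test.txt": staged

git commit --quiet -m "Added a test file"
status_clean="$(git status --porcelain)"       # empty: nothing left to commit

git log --oneline

# Detect the current branch name instead of assuming master.
branch="$(git symbolic-ref --short HEAD)"

git remote add origin "$remote_dir"
git push --quiet -u origin "$branch"

# The stand-in remote should now hold the same commit as the local branch.
local_commit="$(git rev-parse HEAD)"
remote_commit="$(git --git-dir="$remote_dir" rev-parse "$branch")"
echo "local:  $local_commit"
echo "remote: $remote_commit"
```

The --porcelain flag prints a short, stable form of git status, which makes the three states (untracked, staged, clean) easy to compare. Against a real GitHub repository, only the remote URL changes.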

How about contributing to a new project or using somebody else’s code as a starting point for yours?

This is called forking a project, and can be done as follows. Let’s say you want to contribute to the txtWeb Wikipedia project:

15. Click on the Fork button on the project’s GitHub page, as shown in the screenshot below:
[Screenshot: the Fork button, 2013-12-08]

16. Clone your fork to your local machine, so that all the code associated with this project is available locally and you can modify it:

$ git clone https://github.com/username/Wikipedia.git

17. Configure remotes:

In order to keep track of the original repo you forked from, you need to add another remote named upstream:

$ cd /path/to/your/wikipedia

$ git remote add upstream https://github.com/txtWeb/Wikipedia.git

$ git fetch upstream

You can then push your commits to your fork (the origin remote) using the following command:

$ git push origin master
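The fork workflow in steps 15 to 17 can be sketched locally as well. This is only an illustration under assumptions: two local bare repositories stand in for https://github.com/txtWeb/Wikipedia.git (upstream) and for your fork (origin), and all directory names are hypothetical. After the fetch, git remote -v lists both remotes.

```shell
set -e

base="$(mktemp -d)"
upstream_repo="$base/upstream-Wikipedia.git"   # stand-in for the original project
fork_repo="$base/fork-Wikipedia.git"           # stand-in for your fork

# Seed the stand-in upstream with a single commit.
git init --quiet "$base/seed"
cd "$base/seed"
git config user.name "your name"
git config user.email "me@mydomain.com"
echo "txtWeb Wikipedia app" > README
git add README
git commit --quiet -m "Initial commit"
git clone --quiet --bare "$base/seed" "$upstream_repo"

# Forking on GitHub copies the repository; a bare clone imitates that here.
git clone --quiet --bare "$upstream_repo" "$fork_repo"

# Step 16: clone your fork.
git clone --quiet "$fork_repo" "$base/wikipedia"
cd "$base/wikipedia"

# Step 17: add the original repo as a second remote named upstream, then fetch.
git remote add upstream "$upstream_repo"
git fetch --quiet upstream

remotes="$(git remote | sort | paste -sd ' ' -)"
upstream_branches="$(git branch -r --list 'upstream/*')"
git remote -v
```

Keeping origin for your fork and upstream for the original project means a plain git push goes to your own copy, while git fetch upstream brings in whatever has changed in the original project.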

18. A list of helpful links for learning other GitHub commands:

http://try.github.com

http://help.github.com

 http://learn.github.com

Google’s uncut and annotated Search Quality Review Meeting

This video shows a search quality review meeting at Google; yes, they publicly released a video of it. This particular “Weekly Review Meeting” reviewed spelling correction for long queries. The amount of thought that goes into every single change at Google, the depth of the discussions and the profiles of the people taking part just blew me away.

This is a great watch for anyone who has ever wondered how Google Search manages to get it right every single time!