Source code for SIFT, ORB, FAST and FFME for OpenCV C++ for egomotion estimation

Hi everybody!

This time I bring some material about local feature point detection, description and matching. I was wondering which method I should use for egomotion estimation in on-board applications, so I decided to make a (simple) comparison between some methods I have at hand.

The methods I’ve tested are:

  • SIFT (OpenCV 2.x C++ implementation, included in the nonfree module): The other day I read a study claiming that SIFT is the best generic method, and that any other approach will only outperform it in speed or at some specific task. Nevertheless, you might know that SIFT is patented, so if you plan to use it in your commercial applications, prepare yourself to pay (visit Lowe’s website for more details: http://www.cs.ubc.ca/~lowe/).
  • FAST (OpenCV 2.x C++ implementation): I’ve selected this one due to its ability to provide a lot of features in a short time. It is probably the fastest method for local feature detection. Combined with BRIEF, it can give pretty nice results. Some modifications of it can be found on the author’s website: http://www.edwardrosten.com/work/fast.html
  • ORB (OpenCV 2.x C++ implementation): As an alternative to FAST, ORB (Oriented FAST and Rotated BRIEF) appears as a natural extension, which provides invariance to rotation (FAST-BRIEF does not). It’s a little bit slower than FAST-BRIEF, but gives nice results.
  • FFME: This method is a SIFT-like one, but specifically designed for egomotion computation. The key idea is that it skips some of the steps SIFT performs, so that it runs faster, at the cost of being less robust against scaling. The good news is that in egomotion estimation scaling is not as critical as in registration applications, where SIFT should be selected. You can find more details, the paper and so on at the author’s website: https://sites.google.com/site/compvis/home

Personally, I prefer fewer but very distinctive features, and at the same time I like algorithms that run fast. So I would choose SIFT for its robustness, but not for its speed, since it is quite slow. I definitely would not use it if I had to pay (remember, SIFT is nonfree!). I’ve been using FAST-BRIEF and ORB for some time, with good results, but sometimes they give me too many outliers, so I have to construct RANSAC layers to filter them out.

Comparison of SIFT, FAST-BRIEF, ORB and FFME

My choice is, therefore, FFME. It is almost as robust as SIFT, and slightly faster. Last but not least, it is free. It gives well-distributed detections (as opposed to FAST or ORB, which sometimes give too many features in very small regions), which is very good for egomotion estimation.

You can see a video of its performance in a sequence I’ve recorded with my car (see below to get the source code):

[Embedded YouTube video]

FFME is a creation of Carlos Roberto del Blanco (http://www.gti.ssr.upm.es/~cda), who provided the source code freely. I have created a C++ wrapper that can be used with the OpenCV 2.x API, as well as the required modifications to compile FFME with CMake.

SOURCE CODE

Hope you like it!! Ah, please, if you finally publish something and use FFME, don’t forget to add a reference to the great work of Carlos Roberto:

C.R. del Blanco, F. Jaureguizar, L. Salgado, N. García, “Motion estimation through efficient matching of a reduced number of reliable singular points”, SPIE Real-Time Image Processing 2008, San Jose (CA), USA, SPIE vol. 6811, pp. 68110N-1-12, 28-29 Jan. 2008. (DOI 10.1117/12.768125)

Bye!

This entry was posted in Computer vision, OpenCV. Bookmark the permalink.

39 Responses to Source code for SIFT, ORB, FAST and FFME for OpenCV C++ for egomotion estimation

  1. Kishore Kumar says:

    Hello Mr. Marcos Nieto,

    Thanks for sharing your work and also for your blog. I’m doing a project based on indoor navigation. Which algorithm do you think suits best when the robot I’m building needs to navigate closed paths, like in factories and offices, and also avoid people and objects?

    Thank you.

    • Hi! Thanks to you!
      Regarding your project, it heavily depends on the expected result. On the one hand, laser-based sensors can provide the better results in such environments, but their high price (about 3000€+ good ones) make them typically unavailable. In that sense, vision systems can be a very good substitute, especially in indoor environments where the light is more controlled. In such case, I would use stereo vision, since two cameras can provide a solution for depth estimation, while single-camera systems need motion to determine the position.
      In any case, you will probably need to use bundle-adjustment like algorithms to compute simultaneously the position and mapping.
      You can find a lot of information elsewhere googling SLAM (simultaneous localisation and mapping).
      Good luck!
      Marcos

  2. Kishore Kumar says:

    Do you think this course on Udacity would be helpful? Mr. Sebastian Thrun teaches it for making AI self-driving cars and I see that he uses SLAM. http://www.udacity.com/overview/Course/cs373/CourseRev/apr2012
    Or is there any other place where I can get Computer Vision resources for doing this project? I’m doing this project for my undergrad finals and I want to do my Masters in Optics and Photonics.
    Thank you.

    • Yes, definitely that course is worth watching.
      I also recommend checking http://openslam.org where you can find a pretty large number of open-source SLAM projects. There you can search for more resources and samples.
      BR!

  3. ettogawa says:

    Can you post a link to the source code for ORB?


  6. Etto Gawa says:

    Which OpenCV library functions does an application need to call for the FAST and ORB keypoint detection algorithms? Could you give an example of the code for calling each of those functions?

  7. Dmytro Dragan says:

    Good day Mr. Marcos Nieto,

    I’m really excited about your work :-) And I wonder how you chose the best set of parameters for every algorithm (thresholds, kernel sizes, etc.) for your test video?
    Thank you.

    • Hi!
      Sorry I’ve been away for quite a long time…
      Thanks for your comment. Actually, selecting parameters is most of the time solved with a trial-and-error approach : )
      However, a more dedicated effort could be made, and there are methods for automatic parameter selection, such as EM, or even MLE approaches.
      I admit that most of the time I don’t use them, except when I was doing my PhD.
      Best regards,

      Marcos

  8. Ibra says:

    Hey Marcos, I have to compare BRIEF and BRISK against SIFT. Can you please tell me which one is better, BRIEF or BRISK, and why? I have a pair of images.

    • BRISK is probably a good default option, although it really depends on the application and the transforms the images may undergo.
      I suggest you take a look at the latest paper a colleague of mine wrote about this topic (see I. Barandiaran, M. Graña, and M. Nieto, “An empirical evaluation of interest point detectors,” Cybernetics and Systems: An International Journal, vol. 44, no. 2-3, pp. 98-117, 2013 (DOI: 10.1080/01969722.2013.762232)). He has just finished his PhD on this topic. I will soon post the awesome evaluation framework he has created with geometric and photometric transforms… In the meantime you can find a beta website with the material at http://www.vicomtech.tv/keypoints
      Best regards

  9. Ibra says:

    Thanks for the reply… I want to know how BRISK is better than BRIEF. I have implemented BRISK using OpenCV and then matched the features using FLANN, but I got false results. What should the criteria be when matching binary features? Also, is BRIEF only a descriptor extractor, while BRISK is both a feature detector and extractor?

  10. Ibra says:

    I have 2 images and the difference between them is their viewpoint. I think BRIEF is better in this context? What do you think?

    • BRIEF is very fast, but not robust against severe geometric transforms. ORB would do the job in such situations.
      In my mind FAST-BRIEF is for small motion (such as egomotion), while ORB or others (SURF) are for tracking, mosaicking and things like that.

  11. Ibra says:

    I have two choices, BRIEF and BRISK?

  12. Ibra says:

    I have used BRISK+BRIEF in one code and BRISK alone in another, with the same matching criteria, and the BRISK+BRIEF combination was better than BRISK alone (I am referring to the BRISK detector with the BRIEF descriptor), using OpenCV.


  14. Etto Gawa says:

    HI marcos, is your site computer-vision-talks.com down?

  15. Kevin says:

    Would you please reupload the modified code? The site http://marcosnieto.net/#Code seems offline at the moment. Thank you.

  16. Suleyman says:

    Hi Marcos
    Sorry for the naive question. I am new to this field.
    I am also trying to find metrics to compare feature extraction methods in the automotive domain for object detection and tracking, particularly in moving conditions. Is your comparison also valid for moving objects like pedestrians, vehicles…?

    • Hi! For detection and tracking it is probably better using region descriptors rather than point descriptors. I mean HOG (histogram of oriented gradients), HOOF (histogram of oriented optical flow), color histograms, etc, and then use a machine learning algorithm like SVM or Adaboost to train a classifier on a dataset of the objects you want to search.
      Best regards!

      • Suleyman says:

        Hi, thank you very much. It will be very helpful.
        I have read “Local invariant feature detectors: A survey”. What other sources do you suggest?
        Regards,
        Suleyman


  18. fxjapan says:

    Hi Marcos,

    I also came across an algorithm by the same people who implemented BRIEF, called DBRIEF.
    My aim is to match a single query image against 100 train images.
    For such an experiment, is DBRIEF or ORB better?
    Any comparisons between BRIEF and ORB would also be helpful.
    Thanks.

    • Hi!
      I don’t know about DBRIEF, but the main difference between BRIEF and ORB is that ORB is an improved version of BRIEF, as it contains orientation information. This way, ORB descriptors can be matched even when rotation transforms have happened between images. For your experiment: check whether there can be rotation between your query image and the train images. If not, I would use BRIEF for being faster.
      Best regards!

  19. Pixel says:

    Hi,
    Topic: an algorithm for identifying objects in a human hand.
    End solution: identify objects in a human hand.
    The objective is to detect an object in a human hand. I tried using the OpenCV Haar classifier, but it does not detect when the hand holds an object, and does detect when the hand is empty.
    Even if we train for specific objects, what would our algorithm do if the human hand holds other, different objects?
    My current approach would be to find the face (using OpenCV) and then the human upper body (using OpenCV classifiers). Please suggest some algorithm/approach to find the hand position/blob.
    Environment: lighting-controlled environment / indoor
    Camera: frontal face camera (the human upper body is clearly viewable in the camera’s field of view)
    Number of cameras: one
    Camera resolution: 640×480
    Possible scenarios: there could be more people in the FOV (field of view), for example people standing in a queue.
    Which algorithm would be most effective in identifying an object in a human hand?

    • Hi!
      To identify objects you probably first need to find the adequate features that best describe the object, and then train a classifier that does the job (typically SVM or Adaboost).

      In general, Haar and HOG features have shown good performance for faces, upper bodies and full bodies. OpenCV has full support for this.

      Kind regards,

      Marcos
