Marc Najork

Marc joined Microsoft Research Silicon Valley in October 2001. He is currently working on link-based ranking algorithms for web search results. Past projects at Microsoft include heuristics for detecting spam web pages; PageTurner, a large-scale study of the evolution of web pages; and Boxwood, a distributed B-Tree system.

Before joining MSR, Marc spent eight years at the Systems Research Center of DEC, then Compaq (and now HP). Projects at SRC included Mercator, a high-performance distributed web crawler; JCAT, a web-based algorithm animation system; and Obliq-3D, a scripting system for 3D animations.

Marc is the editor-in-chief of ACM TWEB and is co-chairing the news section of CACM. He served as conference chair of WSDM 2008 and program co-chair of WWW 2004.

Marc received a Ph.D. in Computer Science from UIUC for his work on Cube, a 3D visual programming language.

Publications

Issued Patents

  • Marc A. Najork. Changing number of machines running distributed hyperlink database. US Patent 8,392,366, issued 3/5/2013.
  • Marc A. Najork. Incremental update scheme for hyperlink database. US Patent 8,209,305, issued 6/26/2012.
  • Marc A. Najork, Dennis C. Fetterly, Mark S. Manasse, and Alexandros Ntoulas. Using content analysis to detect spam web pages. US Patent 7,962,510, issued 6/14/2011.
  • Marc A. Najork. Query dependant link-based ranking using authority scores. US Patent 7,818,334, issued 10/19/2010.
  • Marc A. Najork. Query dependent link-based ranking. US Patent 7,792,854, issued 9/7/2010.
  • Marc A. Najork. Deletion and compaction using versioned nodes. US Patent 7,783,671, issued 8/24/2010.
  • Marc A. Najork. Systems and methods for ranking documents based upon structurally interrelated information. US Patent 7,739,281, issued 6/15/2010.
  • Marc A. Najork. Systems and methods for inferring uniform resource locator (URL) normalization rules. US Patent 7,680,785, issued 3/16/2010.
  • Marc A. Najork. Fault tolerance scheme for distributed hyperlink database. US Patent 7,627,777, issued 12/1/2009.
  • Marc A. Najork. System and method for maintaining a distributed database of hyperlinks. US Patent 7,340,467, issued 3/4/2008.
  • Marc A. Najork. System and method for distributed web crawling. US Patent 7,139,747, issued 11/21/2006.
  • Marc A. Najork and Chandramohan A. Thekkath. Algorithm for tree traversals using left links. US Patent 7,082,438, issued 7/25/2006.
  • Marc A. Najork and Chandramohan A. Thekkath. Deletion and compaction using versioned nodes. US Patent 7,072,904, issued 7/4/2006.
  • Marc A. Najork and Chandramohan A. Thekkath. Algorithm for tree traversals using left links. US Patent 7,007,027, issued 2/28/2006.
  • Marc A. Najork and Clark A. Heydon. System and method for efficient filtering of data set addresses in a web crawler. US Patent 6,952,730, issued 10/4/2005.
  • Marc A. Najork. System and method for identifying cloaked web servers. US Patent 6,910,077, issued 6/21/2005.
  • Marc A. Najork, Clark A. Heydon, Michael Mitzenmacher, and Monika H. Henzinger. System and method for near-uniform sampling of web page addresses. US Patent 6,594,694, issued 7/15/2003.
  • Marc A. Najork and Clark A. Heydon. Web crawler system using parallel queues for queuing data sets having common address and concurrently downloading data associated with data set in each queue. US Patent 6,377,984, issued 4/23/2002.
  • Marc A. Najork and Clark A. Heydon. System and method for associating an extensible set of data with documents downloaded by a web crawler. US Patent 6,351,755, issued 2/26/2002.
  • Marc A. Najork and Clark A. Heydon. System and method for enforcing politeness while scheduling downloads in a web crawler. US Patent 6,321,265, issued 11/20/2001.
  • Marc A. Najork and Clark A. Heydon. System and method for efficient representation of data set addresses in a web crawler. US Patent 6,301,614, issued 10/9/2001.
  • Marc A. Najork, Clark A. Heydon, and Janet L. Wiener. Web crawler system using plurality of parallel priority level queues having distinct associated download priority levels for prioritizing document downloading and maintaining document freshness. US Patent 6,263,364, issued 7/17/2001.