Facebook AI Researchers Discuss Similarity Search at 2020 Milvus Community Conference

Milvus
Nov 9, 2020

Keynote speakers addressed developing Faiss, applications of similarity search, and more.

BEIJING, Oct. 17, 2020 — Data science software company Zilliz, primary contributor to the open-source embeddings similarity search project Milvus, hosted the first annual Milvus Community Conference last week in Beijing at the Wanda Vista Hotel. Over 200 people attended the event in person, and more than 5,000 joined virtually.

Event speakers included Matthijs Douze and Jeff Johnson, Research Scientists at Facebook AI; Charles Xie, Founder & CEO of Zilliz; and others. Presentations and Q&A sessions covered topics spanning the development of Faiss, applications of vector similarity search, solving unstructured data challenges with Milvus, and more.

2020 Milvus Community Conference attendees.

Facebook AI researchers discuss similarity search and developing Faiss

Matthijs Douze and Jeff Johnson, researchers at Facebook AI, gave a joint conference session remotely. Facebook AI is part of Facebook’s larger machine learning group and operates as a distributed team with offices across the U.S. and Europe. The group researches topics including computer vision, conversational AI, natural language processing (NLP), recommendation engines, and more.

Douze and Johnson are currently the primary developers of Facebook AI Similarity Search (Faiss). Douze specializes in similarity search, embeddings, and unsupervised learning, while Johnson is an expert in low-level implementation of machine learning methods and optimizing AI algorithms for hardware efficiency. Johnson authored most of the original GPU backend for PyTorch, the primary neural network training library used and developed at Facebook.

Matthijs Douze and Jeff Johnson, Research Scientists at Facebook AI.

Although not formally a research category at Facebook AI, similarity search has become more relevant as data science shifts from analyzing structured databases to analyzing embeddings. Increasingly, images, video, audio, text, and other data types are being converted into embeddings (fixed-size vectors with roughly 100 to 1,000 dimensions) and then analyzed using machine learning. Converting raw data into embeddings enables analysis of massive, trillion-vector-scale datasets that would otherwise be virtually impossible to work with.
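To make the idea concrete, here is a minimal sketch of similarity search over embeddings. It is not how Faiss or Milvus work internally: it is an exhaustive scan in plain Python, and the random vectors, the 128-dimension size, and the function names are all assumptions chosen for illustration. Libraries such as Faiss exist precisely because this brute-force approach does not scale to very large collections.

```python
import math
import random


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_search(database, query, k):
    """Return the indices of the k database vectors closest to the query.

    Every vector is compared against the query, so the cost grows
    linearly with the size of the database.
    """
    ranked = sorted(
        range(len(database)),
        key=lambda i: l2_distance(database[i], query),
    )
    return ranked[:k]


# Hypothetical data: 1,000 random 128-dimensional "embeddings".
random.seed(0)
DIM = 128
database = [[random.random() for _ in range(DIM)] for _ in range(1000)]

# Build a query that is a slightly perturbed copy of vector 42,
# so vector 42 should come back as the nearest neighbor.
query = list(database[42])
query[0] += 0.001

top_ids = brute_force_search(database, query, k=5)
print(top_ids)
```

In a real application the vectors would come from a trained model rather than a random number generator, and an index would replace the full scan.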

Faiss began as an internal C++ library at Facebook and has become a primary tool for indexing vectors since it was released under an open-source license in 2017. As authors of Faiss, Douze and Johnson spoke about the tool’s functionality, ideal scenarios for using the CPU and GPU versions of Faiss, how to balance tradeoffs that come with similarity search, and more. For additional information, watch the full presentation on Faiss from the 2020 Milvus Community Conference.

Zilliz talks building Milvus, an embeddings similarity search engine powered by Faiss

Charles Xie, Founder and CEO of Zilliz, attended the conference alongside the company’s core R&D team. Zilliz is the primary contributor to Milvus, an open-source vector similarity search engine powered by Faiss and currently an incubation-stage project at the LF AI Foundation. Xie spoke about the project’s evolution, its release under an open-source license in 2019, and future development plans.

Charles Xie, Founder and CEO of Zilliz.

Hai Jin, Director of Research and Development at Zilliz, discussed the unique functionality Milvus builds on top of Faiss. According to Jin, Milvus is a cloud-native application with support for different indexes (HNSW, Annoy, etc.), more efficient resource utilization, and other database management optimizations, which makes it a more powerful and adaptable tool than Faiss on its own.

Hai Jin, Director of Research and Development at Zilliz.

Watch the 2020 Milvus Community Conference in full

Additional speakers presented real-world applications of Milvus for making sense of unstructured data. These included building a video recommendation engine at leading Chinese video streaming site iQIYI, creating a registered trademark search engine at Beijing-based online housing platform Beike, and more. Watch the 2020 Milvus Community Conference in its entirety to see these sessions and learn how vector similarity search solves modern big data challenges.
