Inter-American Development Bank

Abierto al público


How we used Natural Language Processing to connect people with knowledge through the FindIt platform

April 13, 2021 by Mónica Hernández - Kyle Strand



If it’s difficult for a person to find the information they are looking for, imagine how complex it is to teach artificial intelligence algorithms to identify relevant information and deliver it when a user needs it! This is precisely the challenge we encountered when developing FindIt: an intelligent platform that, much like Netflix, brings knowledge generated by the IDB Group to its staff and external audiences. 

We faced two main challenges creating this platform: 

  • Offering content recommendations to our users proactively, requiring minimal effort and delivering information even before they ask for it. 
  • Recognizing the level of experience of colleagues based on data that the organization already had, to be able to recommend who to talk to about a certain topic. 

We are proud that we managed to overcome both challenges, and in this article we’ll tell you how. 

How do you teach algorithms to simulate intelligence? 

To start, think like a person! When you need to find information on a specific topic, you think about it in words and concepts, using natural language. So, for an algorithm, which is just a finite set of instructions, to have any chance of responding to a request with relevant suggestions, it has to learn to understand human language.  Even more difficult in this case, it also has to learn to understand the jargon that we use at the IDB. 

There are multiple current trends in implementing artificial intelligence, and natural language processing (NLP) is perhaps the most dynamic. NLP focuses precisely on understanding, interpreting, and manipulating human language. In that vein, we applied two NLP methodologies to design two "schools" for our algorithms: one focused on ontology, and another on deep learning.

The Ontology Focused School 

The first lesson for our algorithms is Taxonomy which, in essence, is a hierarchical list of terms used to classify information into categories. What does a taxonomy look like? Imagine a tree structure where the main branches represent categories and the secondary branches subcategories. Here you can see an example:
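To make the tree idea concrete, here is a minimal sketch of a taxonomy as nested dictionaries. The category names and the `find_path` helper are illustrative assumptions, not the IDB's actual taxonomy or FindIt's code.

```python
# Hypothetical taxonomy: main branches are categories,
# nested branches are subcategories. Names are illustrative only.
taxonomy = {
    "Energy": {
        "Renewable energy": {"Solar power": {}, "Wind power": {}},
        "Energy efficiency": {},
    },
    "Education": {
        "Early childhood development": {},
        "Higher education": {},
    },
}


def find_path(tree, term, path=()):
    """Return the chain of categories leading to `term`, or None if absent."""
    for name, children in tree.items():
        current = path + (name,)
        if name == term:
            return current
        found = find_path(children, term, current)
        if found is not None:
            return found
    return None


print(" > ".join(find_path(taxonomy, "Solar power")))
# Energy > Renewable energy > Solar power
```

Classifying a resource then amounts to attaching one or more of these terms to it; the path tells you which broader categories it also falls under.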

Although complex, a taxonomy is insufficient for our algorithms to understand human natural language and have the ability to recommend relevant content to our audiences. Therefore, we created an Ontology, which is a sophisticated language model that contains a set of taxonomies, called classes, which represent families of concepts, and their relationship to each other. 

The ontology diagram can be read in multiple ways; here is one interpretation as an example: 

Author – Works for institutions 
Author – Writes content 
Content – Has one or more topics 
Topic – Is related to sectors 
Institutions – Belong to one or more sectors 
Content – Is related to countries and consequently to regions 
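One common way to write relationships like these down is as subject, predicate, object triples. The sketch below encodes the interpretation above that way; the predicate names and the `reachable` helper are assumptions made for illustration, and the real FindIt ontology may be modeled differently.

```python
# Class-level relations taken from the interpretation above.
# Predicate names are illustrative assumptions.
SCHEMA = [
    ("Author", "works_for", "Institution"),
    ("Author", "writes", "Content"),
    ("Content", "has_topic", "Topic"),
    ("Topic", "related_to", "Sector"),
    ("Institution", "belongs_to", "Sector"),
    ("Content", "related_to", "Country"),
    ("Country", "part_of", "Region"),
]


def reachable(start, schema=SCHEMA):
    """Return every class reachable from `start` by following relations."""
    seen, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for subject, _, obj in schema:
            if subject == node and obj not in seen:
                seen.add(obj)
                frontier.append(obj)
    return seen


print(sorted(reachable("Content")))
# ['Country', 'Region', 'Sector', 'Topic']
```

Following the links is what lets a piece of content be connected to a region even though nobody tagged it with one directly.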

In the case of our algorithms, the advantage of an ontological model is that it teaches them language through concepts. These concepts belong to one or more well-established categories, and have attributes that describe their characteristics. When reviewing text, our algorithms can identify the language, know the definition of the terms they recognize, understand the synonyms of those terms, and consistently interpret jargon, dialects and languages. But more importantly, they understand the relationships between concepts allowing them to produce recommendations and answer complex questions such as: 

  1. What content has the IDB recently published on digital transformation in Latin America and the Caribbean? 
  2. Are we working on projects that use drones? 
  3. What policies or actions have been proposed for the economic recovery of Small and Medium Enterprises in LAC after the pandemic? 
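As a rough sketch of how recognizing synonyms helps answer a question like the second one, the example below maps several surface forms to a single concept and then looks for that concept in project descriptions. The synonym table, project IDs, and descriptions are all made up, and simple word matching stands in for whatever FindIt actually does.

```python
import re

# Hypothetical concepts and synonyms; a real ontology would hold many more.
SYNONYMS = {
    "drone": "Drones",
    "drones": "Drones",
    "uav": "Drones",
    "unmanned aerial vehicle": "Drones",
    "sme": "Small and Medium Enterprises",
    "smes": "Small and Medium Enterprises",
}

# Made-up project descriptions.
PROJECTS = {
    "P-001": "Mapping flood risk with UAV imagery",
    "P-002": "Credit lines for SMEs after the pandemic",
    "P-003": "Road maintenance program",
}


def concepts_in(text):
    """Return the set of concepts mentioned in `text`."""
    lowered = text.lower()
    found = set()
    for label, concept in SYNONYMS.items():
        # Match whole words only, so short labels do not match inside other words.
        if re.search(r"\b" + re.escape(label) + r"\b", lowered):
            found.add(concept)
    return found


def projects_about(concept):
    """List the IDs of projects whose description mentions `concept`."""
    return sorted(pid for pid, text in PROJECTS.items() if concept in concepts_in(text))


print(projects_about("Drones"))
# ['P-001']
```

Note that project P-001 never uses the word "drone"; it is found because "UAV" resolves to the same concept.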

We scaled this process to the level of 80,000 digital resources by creating a Knowledge Graph containing the connections that allow our algorithms to make content recommendations and learn from their own experience. 

Interesting, but what is the result? Just like Amazon recommends products to you, FindIt and its algorithms infer that if a user visits a publication about initiatives to increase gender equality, they will surely be interested in other related resources and deliver them in the same interaction. Click on the following image to see an example live and start living the FindIt experience. 
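A minimal sketch of that kind of inference, assuming each resource in the graph is linked to a set of topics: resources that share topics with the one being visited are recommended. The titles, topics, and scoring rule are invented for illustration and are not FindIt's actual ranking logic.

```python
# Made-up resources and the topics they are linked to in the graph.
HAS_TOPIC = {
    "Closing the gender gap at work": {"Gender equality", "Labor markets"},
    "Women entrepreneurs in LAC": {"Gender equality", "Entrepreneurship"},
    "Pension reform options": {"Labor markets", "Pensions"},
    "Solar mini-grids": {"Energy"},
}


def recommend(visited, limit=3):
    """Rank other resources by the number of topics shared with `visited`."""
    topics = HAS_TOPIC[visited]
    scored = []
    for title, other_topics in HAS_TOPIC.items():
        shared = len(topics & other_topics)
        if title != visited and shared > 0:
            scored.append((shared, title))
    # Most shared topics first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


print(recommend("Closing the gender gap at work"))
# ['Pension reform options', 'Women entrepreneurs in LAC']
```

A production recommender would likely weigh many more signals, such as recency or the other relationships in the ontology, but the principle of following shared connections is the same.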

The School Focused on Deep Learning 

In the field of artificial intelligence, deep learning is one of the areas that has truly increased our ability to create intelligent machines. At its core, deep learning is about using algorithms inspired by the structure and function of the human brain. These neural networks, as they are called, iteratively process large amounts of data to discover and infer connections between the data. In seconds, deep learning can perform a volume of analysis that would take a human being several months or even years. 

To apply this methodology in the school focused on deep learning, we gathered more than 2.1 billion words written in English and Spanish about the IDB Group’s work. These words came from sources as varied as publications, job descriptions, strategies, and project proposals. We analyzed that large amount of words with an algorithm that generates word embeddings, creating a model that reveals the relationships between concepts in multiple dimensions. It is important to emphasize that these associations reflect our jargon, our particular way of speaking in the institution, and not simply standard Spanish or English. As an example, the image below presents some interesting connections that the model returned: 

The word “agriculture” is related to “livestock”, “forestry” and “mining”, which is understandable, but the model also shows that the word “econometrics” is closely connected to “agriculture”, which makes sense in the context of the work we do. “Agriculture” is also related to “agricultural”, which is close to the term “El Salvador”, where we support agricultural projects, which we call “operations”. “Operations”, in turn, is connected to terms that we use internally to refer to our operational work at the IDB, terms such as “loans”, “TCs”, and “non-reimbursable funds”. This is an unsupervised process, which means that all the connections between terms are mapped by an algorithm, with no need for human curation, unlike the ontology-focused school that requires regular manual supervision. Although the graph above shows three examples, remember that the full model was created on a scale of more than 2 billion words. 
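To illustrate how such connections are read out of a word-embedding model, the sketch below looks up a word's nearest neighbors by cosine similarity. The three-dimensional vectors are made up by hand for the example; a trained model would learn vectors with many more dimensions from the corpus itself, and the article does not say which embedding algorithm was used.

```python
import math

# Made-up 3-dimensional vectors, for illustration only.
EMBEDDINGS = {
    "agriculture": [0.9, 0.1, 0.0],
    "livestock": [0.8, 0.2, 0.1],
    "forestry": [0.7, 0.1, 0.3],
    "loans": [0.1, 0.9, 0.2],
    "tcs": [0.0, 0.8, 0.3],
}


def cosine(u, v):
    """Cosine similarity: close to 1.0 when two vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest(word, k=2):
    """Return the `k` words whose vectors are most similar to `word`."""
    target = EMBEDDINGS[word]
    others = [(cosine(target, vec), w) for w, vec in EMBEDDINGS.items() if w != word]
    others.sort(reverse=True)
    return [w for _, w in others[:k]]


print(nearest("agriculture"))
# ['livestock', 'forestry']
print(nearest("loans", k=1))
# ['tcs']
```

Because the vectors come from how words are used together, an institution-specific term such as "TCs" ends up near "loans" without anyone defining that relationship by hand.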

There are many potential uses for this language model that our jargon map reveals. In the case of FindIt, we used it to bring a new perspective to analyze text that the organization already had on its personnel to reveal evidence of their skills and experiences. The end result is a tacit knowledge locator, so to speak, that allows colleagues to easily and quickly connect with each other to answer a question, share relevant experience, or bring specific skills to a project or team. And it is all driven by that language model. 
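One plausible way to build such a locator, sketched here under stated assumptions, is to average the word vectors of the text associated with each person and rank people by how close that average is to the query. The vectors, names, and profile text are invented, and the article does not describe FindIt's actual scoring method.

```python
import math

# Made-up 2-dimensional word vectors
# (axis 0 ~ language technology, axis 1 ~ infrastructure).
VECTORS = {
    "nlp": [1.0, 0.0],
    "text": [0.9, 0.1],
    "language": [0.9, 0.0],
    "roads": [0.0, 1.0],
    "bridges": [0.1, 0.9],
    "data": [0.6, 0.3],
}

# Made-up text associated with fictional staff members.
PROFILES = {
    "Ana": "text data language",
    "Luis": "roads bridges data",
}


def embed(text):
    """Average the vectors of the known words in `text`; None if none are known."""
    known = [VECTORS[w] for w in text.lower().split() if w in VECTORS]
    if not known:
        return None
    return [sum(dim) / len(known) for dim in zip(*known)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def find_experts(query):
    """Rank people by similarity between the query and their profile text."""
    q = embed(query)
    if q is None:
        return []
    ranked = sorted(
        ((cosine(q, embed(text)), name) for name, text in PROFILES.items()),
        reverse=True,
    )
    return [name for _, name in ranked]


print(find_experts("natural language processing"))
# ['Ana', 'Luis']
```

In this toy example Ana ranks first even though her profile never contains the exact query phrase, which is the point of matching on meaning rather than on keywords.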

Check out one of the results when you search for natural language processing. Good job FindIt! 

Complementarity: two models are better than one 

FindIt graduated from both schools: the one focused on ontology and the one focused on deep learning. The learning obtained has been applied to understand, classify, and organize digital resources, as well as to infer a person’s knowledge profile.  As a result, now, when faced with a specific request made in words, FindIt contextualizes and suggests relevant information from the IDB Group’s universe of knowledge. This ability to connect users with knowledge increases our capacity for collaboration, and generates greater reuse of knowledge, which takes us one step further on the path of digital transformation. 


By Kyle Strand, Senior Knowledge Management Specialist, and Monica Hernandez, consultant in the Knowledge and Learning Department of the IDB


Filed Under: Knowledge Management Tagged With: Natural Language Processing

Mónica Hernández

Mónica Hernández is a consultant in the Knowledge, Innovation and Communication Sector of the IDB. In 2017, she joined as Project Manager with the task of leading the development of a solution to make the knowledge products produced by the IDB easier to find, using artificial intelligence algorithms and semantic technologies. Mónica is a Computer Science professional with a specialization in Information Technology Management.

Kyle Strand

Kyle Strand is Lead Knowledge Management Specialist and Head of the Felipe Herrera Library in the Knowledge, Innovation and Communication Sector of the Inter-American Development Bank (IDB). For more than a decade, his work has focused on initiatives to improve access to knowledge both at the Bank and in the Latin American and Caribbean region. Kyle designed the first open repository of knowledge products at the IDB and spearheaded the idea of software as a knowledge product to be reused and adapted for development purposes, which led the IDB to become the first multilateral to formally recognize it as such. Currently, Kyle coordinates library services within the organization, supports the open knowledge product lifecycle including publications and open data, and promotes the use of artificial intelligence and natural language processing as a cornerstone of knowledge management in the digital age. Kyle is also executive editor of Abierto al Público, a blog in Spanish that promotes the opening and reuse of knowledge. He has a B.A. from the University of Michigan and an M.A. from the George Washington University.

    Blog posts written by Bank employees:

    Copyright © Inter-American Development Bank ("IDB"). This work is licensed under a Creative Commons IGO 3.0 Attribution-NonCommercial-NoDerivatives. (CC-IGO 3.0 BY-NC-ND) license and may be reproduced with attribution to the IDB and for any non-commercial purpose. No derivative work is allowed. Any dispute related to the use of the works of the IDB that cannot be settled amicably shall be submitted to arbitration pursuant to the UNCITRAL rules. The use of the IDB's name for any purpose other than for attribution, and the use of IDB's logo shall be subject to a separate written license agreement between the IDB and the user and is not authorized as part of this CC- IGO license. Note that link provided above includes additional terms and conditions of the license.


    For blogs written by external parties:

    For questions concerning copyright for authors that are not IADB employees please complete the contact form for this blog.

    The opinions expressed in this blog are those of the authors and do not necessarily reflect the views of the IDB, its Board of Directors, or the countries they represent.

    Attribution: in addition to giving attribution to the respective author and copyright owner, as appropriate, we would appreciate if you could include a link that remits back the IDB Blogs website.


