Knowledge Discovery from Digital Information

Developing innovative technologies to process and analyze massive amounts of digital information, from terabyte to petabyte scale


The Web, data repositories, and digital libraries have become society’s core information resources, and potential knowledge is embedded in these large digital information stores. Effective use of massive data resources requires new methods for processing masses of poorly organized information into useful and actionable knowledge to address complex scientific and social problems. The Knowledge Discovery from Digital Information (KDDI) research cluster advances the design, development, and testing of computational methods and tools to discover and exploit the knowledge available in terabyte- and petabyte-scale information stores. KDDI supports basic and applied collaborative research in the following broad areas: information visualization, machine learning, data mining, natural language processing, Web archiving, digital curation, information retrieval, and image, audio, and video processing.

A synergistic confluence of distinguished, award-winning faculty and outstanding programs provides the foundation for the KDDI cluster, integrating computer science, information science, and library science approaches across colleges. Computer scientists in the College of Engineering bring considerable expertise to the team with pioneering research in data mining, information retrieval, geospatial databases, and natural language processing. Consistently ranked among the top twenty graduate programs in the nation, the College of Information focuses on the intersection of people, information, and technology with special attention to the challenges of digital content management, video analysis, multilingual information access, and medical informatics, complemented by an extensive network of professionals at the forefront of their fields. The UNT Libraries is among an elite group of research institutes and national libraries internationally recognized for enterprising work in Web harvesting and archiving. Featuring a robust, state-of-the-art digital archive infrastructure, the Libraries can access, store, manage, and transfer large quantities of digital information to accommodate a variety of complex research needs. Combined, the interdisciplinary expertise and scale of this three-way collaboration will advance KDDI research, moving projects from development to adoption with the potential to address social, health, environmental, and scientific problems such as:

  • Processing, organization, and visualization of large scientific data sets
  • Information extraction from medical records and medical image retrieval
  • Understanding social interactions in large social networks
  • Defense intelligence, including automatic message processing and identification of patterns of suspicious behavior via text analysis
  • Enhanced access to large document collections via keywords and summaries
  • Support for behavioral studies through the identification of patterns and trends in user-generated data
  • Long-term management and preservation of high-value research data and other information stores

Discovering knowledge is a fundamental mission of a university, and UNT’s mission explicitly addresses the creation, integration, application, and dissemination of knowledge. The KDDI research cluster advances the idea that potential knowledge is embedded in very large digital information sources, and its goal is to enable the creation, extraction, integration, and application of that potential knowledge.