Business

Development and Modeling of AI Learning Data

A business that creates infinite value from data through data intelligence

  • #Data collection
  • #Annotation
  • #In-depth QA
  • #Knowledge graph
  • #Ontology
  • #Corpus building

Experience the cost-effective AI learning data and modeling service of Saltlux, which has worked with organizations including Hyundai Motor Company, Samsung Electronics, KT, SKT, and the National Institute of the Korean Language, and has taken part in the Digital New Deal project.

Data collection, integration, conversion into training data, and quality control account for more than 60% of the cost of developing machine learning-based AI systems and services. The quantity and quality of training data, as well as the accuracy of categorization and tagging, have a significant impact on the performance and quality of AI models. To reduce costs and raise quality at the same time, it is crucial to secure the right technologies, processes, and procedures.
As Korea's first listed AI-specialized company, Saltlux has been advancing the development of large-scale learning data and the optimization of machine learning models in fields ranging from natural language processing and speech and facial recognition to image recognition, medicine, and biotechnology. We operate dedicated learning data and modeling companies in Korea and abroad, along with an automated construction process, construction tools, and a quality inspection system.

What Makes Saltlux's AI Learning Data Development Business So Special?

  • 01

    AI learning data building process and automation tool

    The learning data building process that Saltlux has accumulated and refined over the past 20 years can be tailored and optimized for each data type and AI application. It provides a variety of construction tools (for language, voice, video, autonomous driving, and more) that automate complex tasks such as large-scale data collection, integration, and collaborative annotation (labeling).

  • 02

    Crowd worker environment for building large data

    The process has to be automated, and it also has to be simple enough for hundreds of human workers to cooperate easily in producing and inspecting data. Saltlux has secured its own “Crowworks” platform for crowd workers and uses it to maximize the efficiency of data construction work.

  • 03

    Experience and assets in building the largest learning data in Korea

    With more than 100,000 hours of voice recognition data transcription, 10 terabytes of self-driving learning data, a vast language model learning data set that includes all Korean dialects, and learning data for lung cancer diagnosis and biomarker discovery, Saltlux has the most experience in building AI learning data in Korea. It also has data assets available for recycling and transfer learning.

  • 04

    ML Ops, optimized AI models based on active learning

    With Saltlux's Language Studio, Voice Studio, Talkbot Studio, Vision Studio, and Knowledge Studio, non-developers can produce AI learning data and machine learning models in a collaborative setting. In addition, ML Ops-based tools provide active learning, which can reduce costs and time by up to 80%.

  • 05

    Quality assurance team and process

    It requires a lot of time and money to supervise the work of dozens or even hundreds of human workers to build large-scale learning data and inspect the quality of the output. The dedicated quality assurance team at Saltlux guarantees training data quality at over 99.9%, thanks to their significant knowledge and experience in data, AI, and quality technologies.

  • 06

    Maximizing cost-effectiveness through subsidiaries and partners

    For the past 15 years, Saltlux has operated local subsidiaries in Vietnam and the United States, and it currently manages a partner network for data construction in 20 countries. This network significantly reduces the cost of data construction and makes Saltlux the best partner for clients looking to take their local businesses worldwide.
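
As a rough illustration of the active learning mentioned in item 04, the Python sketch below ranks unlabeled items by the entropy of a model's predicted class probabilities and sends the most uncertain ones to annotators first. This is one common strategy (uncertainty sampling), not a description of how Saltlux's tools work; the function names, item IDs, and probabilities are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_annotation(
    predictions: Dict[str, Sequence[float]], budget: int
) -> List[str]:
    """Return the IDs of the `budget` items the model is least certain about."""
    ranked = sorted(
        predictions,
        key=lambda item_id: entropy(predictions[item_id]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled items (three classes each).
unlabeled_predictions = {
    "utt-001": [0.98, 0.01, 0.01],  # model is confident
    "utt-002": [0.40, 0.35, 0.25],
    "utt-003": [0.70, 0.20, 0.10],
    "utt-004": [0.34, 0.33, 0.33],  # model is nearly guessing
}

to_label = select_for_annotation(unlabeled_predictions, budget=2)
print(to_label)  # ['utt-004', 'utt-002']
```

Human effort goes only to the items where a label is likely to teach the model the most, which is how active learning can cut annotation cost; the actual saving depends on the data and the model.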

Business field

  • Collection and purification of web/social data

    Collecting, extracting, and analyzing millions of data items per day from thousands of web/social data sources

    1. #Hyundai Motor
    2. #Samsung Electronics
    3. #Ministry of National Defense
  • Voice recognition/synthetic data

    Data creation for voice recognition and synthesis by region, gender and age in more than 20 languages

    1. #KT
    2. #ETRI
  • Video/image data annotation

    DNN-based image and video recognition service and high-quality annotations for autonomous vehicles

    1. #Korea Tourism Organization
    2. #Busan Metropolitan City
  • Natural language processing corpus

    Construction of large-scale high-quality, multilingual corpus for in-depth natural language processing and semantic understanding

    1. #Samsung Electronics
    2. #Korea Press Foundation
    3. #Shinhan Bank
  • Q&A and dialog corpus

    Corpus for implementing in-depth Q&A systems and conversation engines based on Seq2Seq and IRQA

    1. #KT
    2. #Samsung Electronics
    3. #Shinhan Bank
  • Knowledge graph and ontology

    Construction of knowledge bases for AI customer consultation systems, in-depth Q&A, NLU, and semantic analysis

    1. #Nonghyup Bank
    2. #Shinhan Bank
    3. #Samsung Electronics
  • Building multi-language auto-translation corpus

    Construction of multilingual parallel corpus for automatic translation engine based on translation memory and NMT

    1. #IBM
    2. #LG Electronics
  • Building learning data for sentiment analysis

    Learning data for training sentiment analysis models on social media posts, customer service inquiries, and civil complaints

    1. #Hyundai Motor
    2. #Korea Press Foundation
  • Collection, conversion, and integration of open data

    Collection, conversion, integration and LOD publishing of open data, including public data

    1. #Ministry of the Interior and Safety
    2. #Ministry of Science and ICT
  • Integration and analysis of spatial data

    RDF conversion and integration of spatial data, together with various data such as sensors, transportation, and tourism

    1. #Ministry of Land, Infrastructure and Transport
    2. #National Geographic
  • Curation of science and technology data

    Data extraction and conversion from graphs, tables, and explanatory texts of papers, patents, and reports

    1. #Samsung Electronics
    2. #Korea Institute of Science and Technology
  • Transformation and integration of healthcare data

    Integration of medical data such as EMR and EHR, conversion to standards (SNOMED, etc.), and graph mining

    1. #SNOMED
    2. #Severance Hospital
    3. #Catholic Medical Center

Reference

  • KMS analysis and knowledge-to-data

    Samsung Electronics

    Mosaic, a collective intelligence platform

  • Document centralization

    Hyundai Motor Group

    Hyundai Motor Group document centralization system

  • KT VOC system

    KT

    Customer VOC analysis, reports and insights on KT telecommunication products

  • R&D Data Science platform

    LG

    Analysis of new sensing technology trends through R&D data collection and cognitive analysis

  • Establishment of open data

    Ministry of Culture, Sports and Tourism National Library

    LOD-based valuable old newspaper platform

  • Establishment of open data

    Electronics and Telecommunications Research Institute (ETRI)

    AI open API/DATA service

  • Establishment of open data

    Korean Intellectual Property Office

    Customized IP-Biz information-sharing platform

  • Establishment of open data

    Korea Culture Information Service Agency (KCISA)

    Convergence and open DB for LOD-based cultural prosperity

  • LOD public platform

    Busan Metropolitan City

    Knowledge graph of Busan's cultural information data and a LOD disclosure platform

  • Social and knowledge data

    Busan Human Resources Development Institute

    Social and knowledge network platform based on an integrated DB of Busan-related experts and intellectuals

  • LOD public portal

    Korea Water Resources Corporation (K-water)

    Industrial ecosystem that enables private use of water resources, map data, and public LOD

  • Establishment of open data

    Gyeonggi Province

    LOD service for search and data utilization of Gyeonggi Provincial Office management data

  • GIS database

    National Geographic Information Institute

    GIS database ontology technology modeling and complex search service