Provides integrated management of large volumes of pre-training data and learned data by various domains
- Upload/download JSON files
- Data pre-processing
- Data statistics processing
- Data sampling
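The data management steps listed above (JSON upload/download, pre-processing, statistics, sampling) can be pictured with a minimal sketch. This is not LANGUAGE STUDIO's own implementation or API: the record schema (`text`, `label`) and the function names `preprocess`, `statistics`, and `sample` are assumptions made purely for illustration.

```python
import json
import random
from collections import Counter

# Hypothetical record schema (an assumption): {"text": str, "label": str}.
RAW_JSON = '''[
  {"text": "  The loan rate rose.  ", "label": "finance"},
  {"text": "The court dismissed the appeal.", "label": "law"},
  {"text": "The loan rate rose.", "label": "finance"},
  {"text": "", "label": "law"},
  {"text": "Bond yields fell sharply.", "label": "finance"}
]'''


def preprocess(records):
    """Trim whitespace, drop empty texts, and drop exact duplicate texts."""
    seen = set()
    cleaned = []
    for rec in records:
        text = rec["text"].strip()
        if not text or text in seen:
            continue
        seen.add(text)
        cleaned.append({"text": text, "label": rec["label"]})
    return cleaned


def statistics(records):
    """Return record count, per-label counts, and mean text length in characters."""
    lengths = [len(rec["text"]) for rec in records]
    return {
        "count": len(records),
        "labels": dict(Counter(rec["label"] for rec in records)),
        "mean_length": sum(lengths) / len(lengths) if lengths else 0.0,
    }


def sample(records, k, seed=0):
    """Draw up to k records without replacement, reproducibly for a given seed."""
    return random.Random(seed).sample(records, min(k, len(records)))


records = preprocess(json.loads(RAW_JSON))         # stands in for "upload"
stats = statistics(records)
subset = sample(records, k=2)
exported = json.dumps(subset, ensure_ascii=False)  # stands in for "download"
```

Fixing the seed keeps the sampled subset reproducible, which helps when the same sample has to be re-exported or reviewed later.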
LANGUAGE STUDIO makes it easy to build domain-specific language models through a user-friendly interface that reduces the need for complex coding. Create custom language models for fields such as finance or law, for both public and private institutions.
LANGUAGE STUDIO lets anyone create large language models that can be easily optimized for each domain.
GUI-based language learning models application on deep learning
Create your own model from specialized language template models pre-trained on terminology gathered from large volumes of domain-specific learning data. Our large-scale language template models can be used for a variety of natural language processing tasks.
High-quality natural language processing is achieved with the most recent machine learning and deep learning (artificial neural network) technologies, which are faster and perform better than earlier techniques.
By using dictionaries and rules specific to each domain in addition to common dictionaries, it offers language models suited to the language characteristics of various fields and provides functions for creating large volumes of learning data for each field.
Saltlux offers models such as BERT, ELECTRA, and RoBERTa, along with a proven model with up to 350 million parameters. Saltlux also provides an ultra-large language model trained on over 100 GB of data for creating domain-specific language models.
Through transfer learning based on pre-trained models, our technology learns various tasks (text classification, sentence embedding, named entity recognition, morpheme analysis) and evaluates quality using our template models in order to provide the best-performing model for your needs.
It classifies input text into predefined classes, allowing you to organize everything from simple sentences to complex documents.
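To make the idea of assigning text to predefined classes concrete, here is a minimal sketch. It deliberately swaps in a small multinomial Naive Bayes classifier rather than the pre-trained transformer models the product is built on; the class name `NaiveBayesClassifier`, the training sentences, and the labels are all illustrative assumptions.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior of the class
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Toy training data (hypothetical), with two predefined classes.
train_texts = [
    "the bank raised the loan interest rate",
    "stock and bond markets fell",
    "the court dismissed the appeal",
    "the judge issued a ruling on the contract",
]
train_labels = ["finance", "finance", "law", "law"]

model = NaiveBayesClassifier().fit(train_texts, train_labels)
prediction = model.predict("the judge reviewed the appeal")
```

A fine-tuned pre-trained model would replace the word counts with learned representations, but the interface is the same: text in, one of the predefined class labels out.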
The sentence embedding method not only captures the meaning of a sentence but also finds similar sentences.
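Finding similar sentences typically comes down to comparing sentence vectors, for example with cosine similarity. The sketch below shows that comparison step only. As a loudly labeled assumption, the `embed` function uses plain word counts as a toy stand-in for a learned sentence embedding, and the function names and example sentences are hypothetical.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy stand-in for a learned sentence embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(query, candidates):
    """Return the candidate sentence with the highest cosine similarity to the query."""
    q = embed(query)
    return max(candidates, key=lambda s: cosine(q, embed(s)))


corpus = [
    "How do I reset my password?",
    "What is the interest rate on a savings account?",
    "Where can I find the court ruling?",
]
best = most_similar("I forgot my password, how can I reset it?", corpus)
```

With a real embedding model, only `embed` changes: it would return dense vectors that place paraphrases close together even when they share no words, which word counts cannot do.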
It provides a user-defined named entity recognition model for input text, enabling information retrieval and entity name recognition for interactive systems that require high accuracy.
It provides analysis results in morphemes, the smallest semantic units of a phrase, allowing adjectives and verbs to be restored to their base forms and the results to be modified directly using lexicographic functions.
It classifies the emotions and sentiments expressed in text, providing a model that distinguishes positive from negative sentences and identifies the emotional state of the other party.
It analyzes the meaning of a sentence, classifies its intent, and offers a specialized model that understands user intention in conversations.
LANGUAGE STUDIO provides key features for text service implementation, from the development of learning models to their deployment and management.
Learning data management
Model learning
Model arrangement
Model management
Labeling tool
Provides integrated management of large volumes of pre-training data and learned data by various domains
GUI-based model learning
Integrated management from model creation and arrangement through to application
Web-based annotation
LANGUAGE STUDIO creates new value by converging with various solutions.
Established a comprehensive knowledge management system covering the overall technology process, network, and technology analysis trends for internally registered KMS information and opinions.
Internal and external data collection systems were built for each subsidiary using the VOC and big data analysis system, and a CS management support system and a customer support monitoring system were established through VOC analysis.
Project to enhance a system that makes data production relatively easy by refining a platform holding a vast amount of news data
A simple semantic-based precedent retrieval method using everyday terminology
Knowledge management system with extensive R&D trend signal detection
Monitoring of outbound incomplete sales and consultation analysis report
Analysis of positive/negative feedback on call center consultation
Customer VOC analysis, reports and insights on KT telecommunication products
Continuous monitoring system for customer complaints through atypical VOC analysis
15,000 hours of learning data, transcription of 2.5 million sentences of regional dialect
Transcription corpus built by recruiting more than 2,000 speakers from each region
Learning data from in-depth expertise interviews covering more than 2,000 hours
5.5 million separate words of spoken language in the corpus of the foundational study on contemporary language.
Online colloquial learning data based on a total of 2 billion words of web data
15,000 hours of broadcast conversations, establishing about 15.4 million raw corpus