This is a microservice API of Content Lense, a project that aims to enable publishers to easily gain insights into their content. This API selects the n best-matching topics for a given article from a given list of topics.
Please note that this repository is part of the Content Lense Project and depends on the Content Lense API.
Start the container with
docker compose up
Use -d to run it in the background.
To analyse an article, send a POST request to the /articles endpoint with Content-Type: application/json and the following structure:
{
"body": "The entire article.",
"customTopics": ["Wirtschaft", "Fußball", "Politik", "Wissenschaft", "Geld"],
"totalTopics": 3
}
The response looks like this:
{
"topics": ["Wissenschaft", "Politik", "Geld"]
}
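The request and response shapes above can be exercised from Python with nothing but the standard library. This is a minimal sketch, not part of the project: the base URL (host and port) is an assumption, since the README does not state where the container listens, and the helper names `build_request` and `fetch_topics` are hypothetical.

```python
import json
import urllib.request

# Assumption: placeholder host and port; use whatever your compose setup exposes.
BASE_URL = "http://localhost:8000"


def build_request(body, custom_topics, total_topics, base_url=BASE_URL):
    """Build the POST request for the /articles endpoint (hypothetical helper)."""
    payload = {
        "body": body,
        "customTopics": custom_topics,
        "totalTopics": total_topics,
    }
    return urllib.request.Request(
        url=f"{base_url}/articles",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_topics(body, custom_topics, total_topics, base_url=BASE_URL, timeout=600):
    """Send the article and return the list stored under the "topics" key."""
    request = build_request(body, custom_topics, total_topics, base_url)
    # Generous timeout: classification on CPU can take minutes per article.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["topics"]
```

With the container running, calling `fetch_topics("The entire article.", ["Wirtschaft", "Politik", "Geld"], 2)` should return a list of two topics taken from the supplied list.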
- Hugging Face's Zero-Shot-Classification pipeline (https://huggingface.co/docs/transformers/v4.15.0/en/main_classes/pipelines#transformers.ZeroShotClassificationPipeline)
- Pretrained model with German dataset (https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-mnli-xnli?candidateLabels=politics%2C+economy%2C+entertainment%2C+environment&multiClass=false&text=Angela+Merkel+ist+eine+Politikerin+in+Deutschland+und+Vorsitzende+der+CDU)
- As Docker is not yet configured to use the GPU, determining the topics for one article takes around 3 minutes (30 topics, 700 words).
- With a GPU, our tests took less than a minute, depending on the GPU instance.
- E.g. about 20 seconds on an NVIDIA Tesla M60.
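To illustrate how the pipeline and model listed above could fit together, here is a minimal sketch. It is an assumption about the approach, not the service's actual code: the function names `top_n_topics`, `pick_device`, and `classify` are hypothetical, and using the GPU from inside the container would still require Docker to be configured for it.

```python
def top_n_topics(result, n):
    """Return the n highest-scoring labels from a zero-shot result dict.

    The pipeline already returns labels ordered by likelihood; sorting
    explicitly keeps this helper independent of that ordering.
    """
    ranked = sorted(
        zip(result["labels"], result["scores"]),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [label for label, _score in ranked[:n]]


def pick_device(cuda_available):
    """Map CUDA availability to the pipeline's `device` argument.

    transformers pipelines take -1 for CPU and a GPU index (0, 1, ...) otherwise.
    """
    return 0 if cuda_available else -1


def classify(article, custom_topics, total_topics):
    """Classify one article against the given topics (hypothetical helper)."""
    # Imported lazily so the helpers above work without torch/transformers.
    import torch
    from transformers import pipeline

    classifier = pipeline(
        "zero-shot-classification",
        model="MoritzLaurer/mDeBERTa-v3-base-mnli-xnli",
        device=pick_device(torch.cuda.is_available()),
    )
    result = classifier(article, candidate_labels=custom_topics)
    return top_n_topics(result, total_topics)
```

In a long-running service the classifier would normally be created once at startup rather than per call, since loading the model is expensive; it is built inside `classify` here only to keep the sketch short.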
Media Tech Lab media-tech-lab
Cloud Creators GmbH cloud-creators