DEV Community

artydev

Format summary text with spaCy

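Here is the solution proposed by Groq. It splits a French text into sentences with spaCy, normalizes a few characters, and prints the result as a bullet list: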

import spacy

# Load the French language model
# (install it first with: python -m spacy download fr_core_news_sm)
nlp = spacy.load("fr_core_news_sm")

# Process the text
doc = nlp("Leurs contrôles, dans le cadre du dispositif anti-inflation, ont permis de veiller à ce que les engagements des industriels et distributeurs soient tenus. ...")

# Extract the sentences
sentences = list(doc.sents)

# Normalize characters: replace guillemets with straight quotes and strip the
# circumflex from â, ê and ô. Stripping accents changes valid French spelling
# ("contrôles" becomes "controles"), so this is not a true spelling correction.
orthography_corrected_sentences = []
for sentence in sentences:
    orthography_corrected_sentence = (
        sentence.text
        .replace("«", "\"").replace("»", "\"")
        .replace("â", "a").replace("ê", "e").replace("ô", "o")
    )
    orthography_corrected_sentences.append(orthography_corrected_sentence)

# Format the sentences as a bullet list
bullet_list = []
for sentence in orthography_corrected_sentences:
    bullet_list.append("- " + sentence.strip())

print("\n".join(bullet_list))
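
The chain of `replace` calls in the listing above can also be written as a single translation table. The following is a minimal sketch and not part of Groq's answer: the names `CHAR_MAP`, `normalize` and `to_bullets` are hypothetical, and the sample sentences are invented for illustration. It uses only the standard library, so it can be tried without installing a spaCy model.

```python
# Hypothetical helpers illustrating the normalization and bullet-list steps.
# CHAR_MAP reproduces the character mapping used by the replace() chain.
CHAR_MAP = str.maketrans({
    "«": '"',
    "»": '"',
    "â": "a",
    "ê": "e",
    "ô": "o",
})


def normalize(text: str) -> str:
    """Apply the whole character mapping in a single pass."""
    return text.translate(CHAR_MAP)


def to_bullets(sentences: list[str]) -> str:
    """Strip each sentence and prefix it with '- ', one bullet per line."""
    return "\n".join("- " + s.strip() for s in sentences)


if __name__ == "__main__":
    # Invented sample input; in the post these strings would come from doc.sents.
    sample = ["Leurs contrôles ont permis de veiller. ", " « Bien » dit-il."]
    print(to_bullets([normalize(s) for s in sample]))
```

With spaCy, the same helpers could be fed from the parsed document, for example `to_bullets([normalize(sent.text) for sent in doc.sents])`.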