Hi! Introduce yourself! 👋
We're so excited to see so much interest in the Journalists on Hugging Face community! To start, we'd love to learn a bit more from you: what you'd like to see more of on this page, what aspects of AI journalism interest you, and what projects you're working on, if you'd like to share. 🤗
Hi! Thank you for this great initiative. It would be wonderful to begin a crowd-sourced curation of published journalistic stories or projects using AI!
Would love to better understand what tools we can provide to journalists.
I'd like to explore auto tagging articles with our taxonomy so writers can spend less time dealing with that. No one wants to slog through a list of taxonomy terms they need to click when they’re on deadline.
A personal community service hobby project I’m working on is a RAG application for city government meeting agendas and minutes. The retrieval is more complicated than other RAGs I’ve built because the metadata really matters. And I want it to include links to both source document chunks and the full documents.
Agents seem potentially useful. Could they help with article research?
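A rough sketch of what the metadata-aware retrieval step for a project like the meeting-minutes RAG above could look like, using chromadb's metadata filters; the collection name, field names, sample document, and URL are all illustrative, not from the actual project:

```python
# Illustrative sketch: metadata-aware retrieval for meeting agendas/minutes.
# Field names ("body", "doc_type", "meeting_date", "source_url") and the
# sample document are hypothetical.
import chromadb

client = chromadb.Client()
collection = client.create_collection("city_meetings")

collection.add(
    ids=["minutes-2024-03-12-chunk-01"],
    documents=["The council voted 5-2 to approve the sidewalk repair budget."],
    metadatas=[{
        "body": "city_council",
        "doc_type": "minutes",
        "meeting_date": "2024-03-12",
        "source_url": "https://example.gov/minutes/2024-03-12.pdf",  # hypothetical link
        "chunk_index": 1,
    }],
)

# Filter on metadata so results come from the right body and document type,
# then surface both the retrieved chunk and a link back to the full document.
results = collection.query(
    query_texts=["What did the council decide about sidewalk repairs?"],
    n_results=3,
    where={"body": "city_council"},
)
for doc, meta in zip(results["documents"][0], results["metadatas"][0]):
    print(doc)
    print("source:", meta["source_url"], "chunk", meta["chunk_index"])
```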
@smach Interesting use cases! About tagging, in my previous job, I fine-tuned a model for a slightly different task (categorization) with Autotrain, and it worked like a charm. If you have a dataset with previous articles and the associated keywords, I'm sure you could obtain very interesting results.
There is a no-code interface here https://huggingface.co/autotrain and the doc https://huggingface.co/docs/autotrain/en/index. This tutorial could also be helpful, though a little bit old: https://www.youtube.com/watch?v=OH_e0wOkpZc
About your other project, have you seen this: https://www.vikramoberoi.com/how-citymeetings-nyc-uses-ai-to-make-it-easy-to-navigate-city-council-meetings/ ?
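For anyone who wants to picture the tagging workflow end to end: once a model is fine-tuned (with AutoTrain or otherwise), applying it to an article could look roughly like the sketch below. The model id `your-org/article-tagger` and the confidence threshold are placeholders, not a real checkpoint or a recommended value:

```python
# Illustrative sketch: tagging an article with a fine-tuned text-classification model.
# "your-org/article-tagger" is a placeholder for whatever checkpoint your
# fine-tuning run produces.
from transformers import pipeline

tagger = pipeline(
    "text-classification",
    model="your-org/article-tagger",  # placeholder model id
    top_k=None,  # return a score for every taxonomy label, not just the top one
)

article = "The city council approved a new budget for public transit upgrades..."
scores = tagger([article], truncation=True)[0]

# Suggest only the taxonomy terms the model is reasonably confident about.
suggested_tags = [s["label"] for s in scores if s["score"] > 0.5]
print(suggested_tags)
```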
Hi, I have a degree in Journalism and a postgraduate degree in Machine Learning. I'm studying and developing language models fine-tuned for Brazilian Portuguese and training image models that represent Brazilian culture (I haven't published them to HF yet, only on Civitai). I believe these initiatives can help produce more content in Portuguese and spread knowledge of Brazilian culture through more faithful representation in AI-generated images.
My interest in joining this org is to help build this bridge across languages and cultures.
@smach this is an old project, but it could work for your use case: https://huggingface.co/spaces/pleonova/multi-label-summary-text
I would also recommend using SetFit; it works like a charm!
LLMs are pretty good at giving you categories/topics, and you could use those in your RAG pipeline, which is something I'm doing currently.
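In case it helps, here is a minimal SetFit sketch in the spirit of the suggestion above, following the setfit v1.x style (the exact API differs a bit between versions); the labels and training sentences are made up for illustration:

```python
# Illustrative sketch: few-shot tagging with SetFit.
# Labels and example sentences are invented for the demo.
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

train_ds = Dataset.from_dict({
    "text": [
        "Council approves funding for new bike lanes downtown.",
        "Local high school wins the state robotics championship.",
        "Transit agency announces weekend subway closures.",
        "Teachers union negotiates a new contract with the district.",
    ],
    "label": [1, 0, 1, 0],  # indices into the labels list below
})

model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",
    labels=["education", "transportation"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(batch_size=4, num_epochs=1),
    train_dataset=train_ds,
)
trainer.train()

print(model.predict(["The school board votes on the new curriculum tonight."]))
# expected something like: ["education"]
```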
@lucianosb Welcome to the community! I'm curious: is it easy to find datasets in Portuguese or tailored to the Brazilian culture?
Olá @lucianosb ! My city has a lot of residents who are originally from Brazil. It would be useful if some LLMs translated to Brazilian Portuguese well.
I work in broadcast television, and I'm excited to learn from everyone. Headed to the Local Media Consortium conference next week (https://event.localmediaconsortium.com/). Will tell everyone about this group. I keep a blog with all of the links I see every week, if anyone's interested. It's purely a labor of love to keep me conversant.
https://ethanbholland.com/. Thanks for including me.
Hi everyone, I'm exploring some use cases for media organizations, with a first PoC product already piloted at events with 200+ attendees. I'd be super interested to hear more about journalists' needs, so feel free to reach out if you'd like to have a chat!
I'd like to explore auto tagging articles with our taxonomy so writers can spend less time dealing with that. No one wants to slog through a list of taxonomy terms they need to click when they’re on deadline.
I am also very interested in automatic tagging. Apart from articles, I would like to use speech-to-text transcripts to tag radio and TV programmes according to a standard taxonomy: https://iptc.org/standards/media-topics/
I find it hard to figure out the best way to handle such a multi-label problem with so many hierarchically structured classes, especially when it comes to tagging non-English content (in our case, mostly German).
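Not a full answer to the hierarchy question, but one low-effort starting point is multilingual zero-shot classification, which handles German text and multi-label output out of the box. The checkpoint below is one commonly used multilingual NLI model, and the topic list is a tiny illustrative subset of IPTC Media Topics:

```python
# Illustrative sketch: zero-shot, multi-label tagging of a German transcript
# against a handful of example topics. The threshold and topics are illustrative.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/mDeBERTa-v3-base-mnli-xnli",
)

transcript = (
    "Der Stadtrat diskutierte heute über neue Buslinien und die "
    "Finanzierung des öffentlichen Nahverkehrs."
)
# English: "The city council discussed new bus routes and public transit funding today."
candidate_topics = ["politics", "transport", "economy, business and finance", "sport"]

result = classifier(transcript, candidate_topics, multi_label=True)
tags = [label for label, score in zip(result["labels"], result["scores"]) if score > 0.5]
print(tags)
```

For the hierarchy, one pragmatic option is to classify against the top-level topics first and then re-run only on the children of whichever parents score highly.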
@lucianosb Welcome to the community! I'm curious: is it easy to find datasets in Portuguese or tailored to the Brazilian culture?
I believe there is still room for improvement in Portuguese-language datasets. A lot more datasets have become available since the LLM hype peaked, which has allowed for a lot of fine-tuned models, as you can see on the Open PT LLM Leaderboard.
@constantinSch , I discussed this with the team, and thought you might find this resource useful for training your own model:
https://github.com/huggingface/transformers/tree/main/examples/pytorch/audio-classification
To give you a point of reference for the amount of training data needed, the Keyword Spotting (KWS) task uses about 25 hours of labeled audio data.
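For orientation, once a model has been fine-tuned with that example script, inference can be as simple as the sketch below; the checkpoint is a public keyword-spotting model used only as a stand-in, and `clip.wav` is a placeholder file:

```python
# Illustrative sketch: running an audio-classification checkpoint on a clip.
# "superb/wav2vec2-base-superb-ks" is a public keyword-spotting model used
# here only as a stand-in; "clip.wav" is a placeholder path.
from transformers import pipeline

classifier = pipeline(
    "audio-classification",
    model="superb/wav2vec2-base-superb-ks",
)
predictions = classifier("clip.wav", top_k=3)
print(predictions)  # [{"label": ..., "score": ...}, ...]
```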
Looks like my comments and I are not welcome in this group and I have been removed from it. I wish someone would have reached out if I violated your group policies.
Btw, there might be a bug where people who are not part of this group can still post messages.
My best wishes to you all.
Hello, I'm Camilo from Colombia. I'm sharing this all-in-one news generator with you. It can process the most common sources of information journalists use and write highly professional news articles. It will save journalists hours of work.
https://huggingface.co/spaces/CamiloVega/AI_News
Hi! I teach in the Journalism + Design program at The New School in New York. I joined this group a while ago but hadn't used any of these tools until I attended @fdaudens ' session today at Media Party. I am eager to try more of these and use them with my students.
@CamiloVega I’m curious to know more! How do you use it? Feel free to share screenshots.
hi, all- I'm an old-timer journalist (started in 1990!) and made a small name for myself as the creator of Back-to-Iraq.com. I've covered tech off and on through the years and now write a weekly column about how journalists can use AI for The Media Copilot on Substack. Obviously very interested in seeing what everyone gets up to on here!