
Do you have any tips on how to effectively parse website content? I tested it on one of my websites, and it was able to answer questions based on content located in separate div/p containers. Do you divide the content into sections and use embeddings to find the relevant text, or do you feed the entire page content into the API?


BeautifulSoup seems to work well for parsing. For your other question: something like that!
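To make the "split into sections, then find the relevant text" idea concrete, here is a rough sketch of one way it could look. It is not the actual implementation discussed in this thread. Two stand-ins are assumptions made so that it runs with no dependencies: the standard library's html.parser takes the place of BeautifulSoup, and a bag-of-words cosine score takes the place of real embeddings. The names BlockTextExtractor, bag_of_words, cosine, and top_chunks are hypothetical.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class BlockTextExtractor(HTMLParser):
    """Collect the text of each block-level element as a separate chunk."""

    BLOCK_TAGS = {"div", "p", "li", "h1", "h2", "h3"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._buffer = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self._flush()  # a new block starts, so close the previous chunk

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if not self._skip_depth:  # ignore script and style bodies
            self._buffer.append(data)

    def _flush(self):
        # Collapse runs of whitespace and drop empty chunks.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.chunks.append(text)
        self._buffer = []

    def close(self):
        super().close()
        self._flush()


def bag_of_words(text):
    """Stand-in for an embedding: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_chunks(html, question, k=2):
    """Return the k chunks that score highest against the question."""
    parser = BlockTextExtractor()
    parser.feed(html)
    parser.close()
    query = bag_of_words(question)
    ranked = sorted(
        parser.chunks,
        key=lambda chunk: cosine(bag_of_words(chunk), query),
        reverse=True,
    )
    return ranked[:k]
```

Only the top-ranked chunks would then go into the prompt instead of the whole page. With a real embedding model, bag_of_words and cosine would be replaced by the model's vectors and a vector similarity, and the chunking step would stay the same.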



