I created a search engine for the pages of The Saturday Evening Post magazine. For now, I have indexed only the first 30 pages of the 1971 issue, just to check how it works. I only have scans, so my program extracts the text from them, splits it into sentences (1,384 of them), and places those in an index. A vector is built for each sentence and for the query, and the sentence whose vector is closest to the query's is displayed.
Here is how it works.
Search query: “How many birds are on the water among the trees?”
System response: “Six majestic Canadian geese float on the surface of a small pond, hidden deep in a safe forest, resting before the last leg of their flight to their northern home.”
Search query: “Where is Rockwell’s studio located?”
System response: “Rockwell’s studio is located behind his house in Stockbridge, Massachusetts.”
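
For anyone curious about the mechanics, here is a minimal sketch of the "one vector per sentence, one vector for the query, return the closest sentence" idea. It is not my actual code: it assumes the text has already been extracted from the scans, and it swaps in plain TF-IDF word-count vectors with cosine similarity so that it runs with nothing but the Python standard library. Word-count vectors only match shared words, so they would not connect "birds" to "geese" the way the first example does; that kind of match needs vectors from a learned embedding model. All function names (split_sentences, build_index, search) and the sample text are made up for illustration.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", sentence.lower())


def build_index(sentences):
    """Return (idf, vectors): one sparse TF-IDF vector (a dict) per sentence."""
    docs = [Counter(tokenize(s)) for s in sentences]
    n = len(docs)
    # Document frequency: in how many sentences each term appears.
    df = Counter(term for doc in docs for term in doc)
    # Smoothed inverse document frequency (an assumed weighting choice).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, sentences, idf, vectors):
    """Return the best-matching sentence and its similarity score."""
    counts = Counter(tokenize(query))
    # Query terms never seen in the index carry no weight and are dropped.
    q = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scores = [cosine(q, v) for v in vectors]
    best = max(range(len(sentences)), key=scores.__getitem__)
    return sentences[best], scores[best]


# Hypothetical sample standing in for text extracted from the scans.
sample = (
    "Six geese float on the surface of a small pond. "
    "Rockwell's studio is located behind his house in Stockbridge, Massachusetts. "
    "The issue was published in 1971."
)
sentences = split_sentences(sample)
idf, vectors = build_index(sentences)
answer, score = search("Where is Rockwell's studio located?", sentences, idf, vectors)
print(answer)
```

Swapping the TF-IDF dictionaries for embedding vectors would leave the search step the same: compute the cosine similarity against every sentence and take the maximum.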


