June 22 2017, 01:28

Exploring the topic of NLP, I stumbled upon an interesting e-commerce search project: AddStructure (https://www.addstructure.com/). Queries in natural language, support for contextual search, transparent facets. Cool, well done guys. Looks like a breakthrough. But who are the clients?..

June 21 2017, 22:54

Today, I discovered OpenNLP, a Java toolkit for text processing based on machine learning methods. I spent the entire evening getting it to work with SOLR. Someone had started an integration, but it was feeble and did not work. I finally got it running properly. Now, when I search for a red IKEA table, I can show not just anything red from IKEA, but all tables: red ones first, then those from IKEA, and then the rest of the tables. Without this tweak, one of these three criteria would dominate the ranking, and the store owner could not easily enforce any particular strategy.
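To make the ranking idea concrete, here is a minimal sketch, not the actual integration. It assumes the NLP step has already tagged each query word with a field, and it turns those tags into a Solr query where the product type is mandatory and color and brand only add score. The field names `type`, `color`, `brand`, the boost values, and the function `build_solr_query` are all hypothetical; the only real Solr syntax used is `+` for a required clause and `^=` for a constant score.

```python
def build_solr_query(tagged_terms, required_field="type", boosts=None):
    """Build a Solr standard-parser query from (field, value) pairs.

    Assumptions (not from the original post): values are single tokens,
    the field names and boost values are illustrative only.
    """
    boosts = boosts or {"color": 20, "brand": 10}

    # The product type must match, otherwise the document is not a table at all.
    required = [f"+{f}:{v}^=1" for f, v in tagged_terms if f == required_field]

    # Optional clauses only raise the score; the larger constant wins the tie.
    optional = sorted(
        ((boosts[f], f, v) for f, v in tagged_terms if f in boosts),
        reverse=True,
    )
    optional_clauses = [f"{f}:{v}^={b}" for b, f, v in optional]

    return " ".join(required + optional_clauses)


# "red IKEA table" after (hypothetical) tagging
query = build_solr_query([("color", "red"), ("brand", "ikea"), ("type", "table")])
print(query)
```

With constant scores, a red IKEA table scores 31, a red table 21, an IKEA table 11, and any other table 1, which gives the tiers described above. Changing the two boost values is how a store owner could switch to a "brand first" strategy.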

I’ll polish it up; an article is coming soon. Right now, I’m as pleased as punch.

June 19 2017, 10:35

For what it’s worth, I know the anonymous candidate X quite well. The guy is very sharp, and I highly recommend him to anyone building large e-commerce systems from scratch. He has worked in major e-commerce for many years. Grab him. (Just in case: the photo is of Sergey, the post is from him, and he is the one to contact. But remember, this isn’t about Sergey, who is also great, but about the anonymous candidate.)

June 18 2017, 21:24

Published a comprehensive article about PDF generation on the server: libraries, approaches, methods, issues, and their solutions. A special case is when you need to generate a multi-page document like a contract. Created a prototype using document templates and merging.

The article says very little about hybris; the approach is fairly universal and does not depend on the platform.

I also dug into the PDF format a little. My goodness, what a thing it is… But they say that’s how it is at Adobe, and I haven’t even seen PSD yet.

https://hybrismart.com/2017/06/15/pdf-and-sap-hybris/

June 16 2017, 14:38

I did it! I can now create a PDF on the server from a template prepared in MS Word or MS Excel (probably in other software too, but that needs testing). Data from the database or a user form is inserted in place of tokens within the document. And yes, it turned out to be insanely difficult (but now I know a ton about PDFs).
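The token substitution itself can be sketched in a few lines. This is only an illustration on plain text, not the PDF-level code: the `${name}` token syntax, the function `fill_tokens`, and the sample data are assumptions, and a real PDF stores its text in encoded content streams rather than as a simple string.

```python
import re

# Hypothetical token syntax: ${customer}, ${total}, ...
TOKEN = re.compile(r"\$\{(\w+)\}")


def fill_tokens(template_text, data):
    """Replace each ${key} with data[key]; unknown tokens are left untouched."""

    def substitute(match):
        key = match.group(1)
        # Keep the original token if there is no value for it,
        # so a missing field is visible in the output.
        return str(data.get(key, match.group(0)))

    return TOKEN.sub(substitute, template_text)


filled = fill_tokens(
    "Contract for ${customer}, total ${total} EUR, ref ${unknown}",
    {"customer": "Alice", "total": 120},
)
print(filled)
```

Leaving unknown tokens in place is a design choice for the sketch: it makes a missing database field easy to spot in the generated document.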

There are several complications for which there is no neat solution yet:
1) only two monospaced fonts work;
2) of course, the template text does not flow onto the next page if the inserted value is larger than the reserved space;
3) the template needs to include invisible text containing the full variety of characters (to extend the PDF dictionary for that particular font);
4) theoretically, in some new or old version, or because of some clever block placement or specific settings, Word may generate PDFs that my code does not understand. Theoretically.
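The second complication can at least be detected before the PDF is written. The sketch below is an assumption-heavy illustration, not the approach from the article: it supposes the reserved space is measured in characters (which only makes sense for a monospaced font) and that the hypothetical helper `fit_to_reserved` is called for every inserted value.

```python
def fit_to_reserved(value, reserved_chars):
    """Pad a value to the reserved width, or fail loudly if it does not fit.

    Assumes a monospaced font, so width can be counted in characters.
    """
    if len(value) > reserved_chars:
        raise ValueError(
            f"value needs {len(value)} characters, only {reserved_chars} reserved"
        )
    # Pad with spaces so the inserted text covers the whole reserved area.
    return value.ljust(reserved_chars)


padded = fit_to_reserved("Alice", 8)
print(repr(padded))
```

Failing early does not solve the overflow, but it turns a silently clipped contract into an error that can be handled.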