Merge of data from different systems using Fuzzy Matching

Fuzzy matching was used to combine the data from the two sources. Several fuzzy string-matching algorithms were tested, among them Jaro-Winkler distance, Levenshtein distance, Soundex, and cosine similarity. Open-source implementations of these algorithms can be very efficient, and which one to choose depends on the use case.
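To make the idea concrete, here is a minimal sketch of one of the algorithms named above, Levenshtein distance, used to match a record name from one system against candidate names from another. It is not the implementation used in the project: the function names, the lowercase normalization, and the 0.8 threshold are assumptions chosen for illustration, and a real pipeline would more likely rely on an existing open-source library.

```python
from typing import Optional, Sequence


def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale the distance to a score between 0.0 and 1.0.
    Lowercasing and trimming are illustrative normalization steps."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(name: str, candidates: Sequence[str],
               threshold: float = 0.8) -> Optional[str]:
    """Return the most similar candidate, or None if nothing
    reaches the (assumed) threshold."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: similarity(name, c))
    return best if similarity(name, best) >= threshold else None


# Hypothetical company names from two systems
print(best_match("Mueller GmbH", ["Muller GmbH", "Schmidt AG"]))
```

The threshold decides how tolerant the merge is: a lower value links more records but raises the risk of false matches, so it is usually tuned against a manually checked sample.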


It’s All About Data: The Training Methods of Deep Learning

Let’s say our customer needs structured, labeled images for an online tourism portal. The task for our AI model is therefore to recognize whether a picture shows a bedroom, bathroom, spa area, restaurant, etc. Let’s take a look at the possible training methods.


What Data Scientists should know about Data Security

Data scientists, data analysts, and data engineers rarely work only with open data; most of the time they handle internal company data that is of great importance for business success. That is all the more reason for these experts in data storage and analysis to always think about data security and to observe certain rules and principles. In addition to the technical security of the data, legal security also plays a role.
