MyDLP Blog

easy, simple, open source data leakage prevention

Archive for the ‘Data Identification’ Category

Variety in DLP Filters

MyDLP is currently adding new predefined rule patterns to its filter collection. We are working on three new patterns:

  1. Canada SIN
  2. France INSEE
  3. UK NINO

We hope to add these three patterns before the first major release of MyDLP. If you have any other suggestions, please contact us at mydlp[at]
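
To give an idea of what such patterns involve, here is a minimal Python sketch of validators for the three identifiers. It is not MyDLP's implementation; the function names are hypothetical, and the rules are the commonly documented ones: a Canada SIN is nine digits passing the Luhn check, a UK NINO is two prefix letters, six digits and a suffix letter A-D, and a France INSEE number is 13 characters followed by a two-digit key equal to 97 minus the number modulo 97. The France pattern assumes the first digit is 1 or 2.

```python
import re


def luhn_ok(digits):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def is_canada_sin(text):
    """Nine digits (spaces or dashes allowed) that pass the Luhn check."""
    digits = re.sub(r"[ -]", "", text)
    return re.fullmatch(r"[0-9]{9}", digits) is not None and luhn_ok(digits)


# Prefix letters D, F, I, Q, U, V are not used; O is not used as second letter.
NINO_RE = re.compile(r"[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z][0-9]{6}[A-D]")
NINO_BAD_PREFIXES = {"BG", "GB", "NK", "KN", "TN", "NT", "ZZ"}


def is_uk_nino(text):
    """Two prefix letters, six digits, one suffix letter A-D."""
    value = text.replace(" ", "").upper()
    return (NINO_RE.fullmatch(value) is not None
            and value[:2] not in NINO_BAD_PREFIXES)


# Assumption: first digit is 1 or 2; department may be 2A or 2B (Corsica).
INSEE_RE = re.compile(r"[12][0-9]{4}(?:[0-9]{2}|2[AB])[0-9]{6}[0-9]{2}")


def is_france_insee(text):
    """13 characters plus a 2-digit key: key == 97 - (number mod 97)."""
    value = text.replace(" ", "").upper()
    if INSEE_RE.fullmatch(value) is None:
        return False
    # Corsican departments are mapped to digits before computing the key.
    body = value[:13].replace("2A", "19").replace("2B", "18")
    return 97 - int(body) % 97 == int(value[13:])
```

In a real filter, checksum validation like this matters because a bare "nine digits" pattern would match far too much ordinary text.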

Written by burak

October 28th, 2010 at 2:14 pm

New Bayesian Classifier Engine for MyDLP

Previously, we developed a Bayesian classifier engine in Java because of its dependency on a Turkish NLP library (zemberek). However, this engine caused us difficulties in several areas, such as distribution, performance and maintenance.

A week ago, we decided to develop a very simple Turkish NLP module for MyDLP. This was a good decision because zemberek was far more advanced than we needed :) . We weren’t using most of its features, and for every request we had to push a large binary through a Thrift bridge. The large memory footprint of the Java process was also a disadvantage.

Now we are using bayeserl with our own very simple Turkish NLP module. Results are more accurate and performance has improved.
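
For readers unfamiliar with the approach, here is a minimal Python sketch of a naive Bayes text classifier with add-one smoothing. It is only an illustration of the general technique, not bayeserl's API or our engine's code; the `tokenize` function is a hypothetical stand-in for the NLP step and the `NaiveBayes` class is invented for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical stand-in for an NLP module: lowercase, split on non-word chars.
    return [t for t in re.split(r"\W+", text.lower()) if t]


class NaiveBayes:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()                       # all tokens seen in training

    def train(self, label, text):
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def classify(self, text):
        """Return the most probable label, or None if nothing was trained."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, docs in self.doc_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(docs / total_docs)
            total = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

A short usage example, with made-up training data:

```python
nb = NaiveBayes()
nb.train("confidential", "salary report customer account numbers")
nb.train("confidential", "confidential customer salary list")
nb.train("public", "lunch menu for friday")
nb.train("public", "company picnic on friday")
print(nb.classify("customer salary data"))
```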

Try it, use it.

Any comments and questions are very welcome.

Written by kerem

September 11th, 2010 at 6:22 pm