Data mining: auto-tagging rules with regular expressions 2018-04-17T12:10:06+00:00

Data-mining: auto-tagging rules and tags

Here you can find sets of auto-tagging rules and tags that you can use to mine information by scanning your drives and matching the content of your files against patterns (based on regular expressions).

1) IMPORT THE AUTO-TAGGING RULES

The files listed here can be imported into your Tabbles database from the menu File > Tabbles databases > Import data from XML zipped (there is no undo, so make sure you know what you are importing! 🙂)

You can of course create your own rules and edit or delete the existing ones. If you import the file twice, the rules and tags will not be duplicated (unless you have renamed them).

Import auto-tagging rules for data mining, with regular expressions

2) RUN THE RULES ON A DRIVE

To start scraping a drive, right-click the drive in Tabbles, then click the Folder menu and then Apply auto-tagging rules.

Select a drive to scrape
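
To give an idea of what applying rules to a drive involves, here is a minimal Python sketch of the general technique: walk a folder tree, read each file as text, and attach a tag whenever a pattern matches. It is not how Tabbles is implemented. The names `RULES` and `tag_files` and the simplified email pattern are assumptions made up for this illustration.

```python
import re
from pathlib import Path

# Hypothetical rule set: tag name -> compiled pattern.
# The email pattern is deliberately simplified (it is not RFC 5322).
RULES = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}

def tag_files(root, rules=RULES):
    """Walk `root` and return {file path: set of tag names that matched}."""
    tagged = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            # Undecodable bytes are ignored so binary files do not raise.
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file (permissions, locks): skip it
        tags = {name for name, pattern in rules.items() if pattern.search(text)}
        if tags:
            tagged[str(path)] = tags
    return tagged
```

Calling `tag_files("D:/")` would return a dictionary of the files that matched at least one rule. Files that match nothing are left out.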

Feedback/Requests/Suggestions: in the forum topic.

Sensitive Info #1

This is a first attempt at putting together a list of auto-tagging rules to scrape and auto-tag sensitive information. Most of the regexes were found on RegExLib. This list contains rules to match:

  • Names of all the Danish cities with more than 2500 inhabitants (around 1000 cities)
  • Credit card numbers from the major credit card companies (Visa, Mastercard, Amex…)
  • Danish CPR numbers (check on Wikipedia)
  • Email addresses (RFC 5322 Official Standard)
  • Italian Codice Fiscale (check on Wikipedia)
  • US Social Security numbers (this regex gives a lot of false positives, but since the pattern is so simple, that is to be expected)
Scrape: Sensitive Info 1
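
The false-positive problem mentioned above is easy to reproduce. The sketch below uses simplified patterns written for this illustration; they are assumptions, not the regexes shipped in the rule set. It shows that a bare 3-2-4 digit pattern matches anything with that shape. It also shows one common way to narrow down credit card candidates: the Luhn checksum that payment card numbers carry. The Luhn step is an addition for illustration and is not part of the Tabbles rules.

```python
import re

# Simplified, hypothetical patterns (not the ones in the rule set).
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or dashes.
CARD_LIKE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    # Starting from the rightmost digit, double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_cards(text):
    """Return card-like matches in `text` that also pass the Luhn check."""
    return [m.group() for m in CARD_LIKE.finditer(text) if luhn_valid(m.group())]
```

For example, `SSN_LIKE` matches both `123-45-6789` and a part number such as `555-12-3456`, because the pattern alone cannot tell them apart. `find_cards` keeps the widely used test number `4111 1111 1111 1111` and drops `4111 1111 1111 1112`, whose checksum fails.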

UK Passport and National Insurance Number

Following a user request, we put together a database for UK documents:

  • UK Passport based on this regex
  • UK National Insurance number based on this regex
Scan drives for UK passport number and UK National insurance number
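
As a rough idea of what such patterns look like, here is a Python sketch. Both patterns are simplified assumptions written for this illustration and are not the regexes used in the downloadable rules. The National Insurance pattern checks only the general shape (two prefix letters, six digits, a suffix letter A to D) and does not check for unallocated prefixes. The passport pattern assumes a plain nine-digit number, so it will produce many false positives.

```python
import re

# Hypothetical, simplified National Insurance number shape.
# The prefix letters exclude D, F, I, Q, U and V; the second letter also excludes O.
# Optional single spaces are allowed between the groups (e.g. "AB 12 34 56 C").
NINO = re.compile(
    r"\b[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z] ?(?:\d{2} ?){3}[A-D]\b"
)

# Assumption: a UK passport number is written as exactly nine digits.
UK_PASSPORT = re.compile(r"\b\d{9}\b")

def find_uk_ids(text):
    """Return the NINO-like and passport-like strings found in `text`."""
    return {
        "nino": NINO.findall(text),
        "passport": UK_PASSPORT.findall(text),
    }
```

Both patterns match uppercase text only. Compile `NINO` with `re.IGNORECASE` if lowercase numbers should be caught as well.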
