Web Scrapers Generator BrowserExt

Export Mode

You can switch to Export mode by clicking the "Export" tab on the right side of the Editor. This mode is intended for creating and editing Export profiles.

export-en-small.jpg

There are three types of Export profiles: csv (csv, Excel), non-csv (xml, sql) and RDB.

A csv-type profile specifies the order in which the data appears in a table row. A non-csv profile specifies the template that is used for every type of exported data.
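As a rough illustration of what a csv-type profile controls, the Python sketch below writes records with the columns in a fixed order. The field names and the column_order list are hypothetical and are not part of the extension. The sketch only shows the idea that the profile fixes the sequence of values in each row.

```python
import csv
import io

# Hypothetical column order, standing in for what a csv-type profile defines.
column_order = ["title", "price", "url"]

# Hypothetical scraped records; the key order inside each record does not matter.
records = [
    {"price": "100", "url": "http://example.com/1", "title": "Product1"},
    {"url": "http://example.com/2", "title": "Product2", "price": "150"},
]


def to_csv(rows, order):
    """Serialize rows to CSV text with the columns in the given order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=order, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(records, column_order)
print(csv_text)
```

Changing column_order changes the layout of every row without touching the scraped records themselves.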

The Export profile settings contain a "Write to file" or "Write to database" checkbox, which is checked by default. When it is checked, the generated data is written to the file or database. When it is unchecked, the data is not written to the file or database but is kept in memory and returned by the store function.

Exports Based On Templates - XML and SQL

The template contains parameters that are replaced with data during the export. A parameter starts with the $ sign and is enclosed in braces { and }, for example {$title}. A function can be applied to a parameter, for example {notencode($title)}.

The xml and sql profiles differ in how special symbols are escaped.

All data is saved in UTF-8 encoding.

Functions In The Export Templates

notencode(param)
Cancels the escaping of special symbols. The data is written as is.

Export Template Example

<item>
<var1>{$var1}</var1>
<var2>{$var2}</var2>
<var3>{$var3}</var3>
</item>
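
The Python sketch below imitates how such a template might be filled in. It is not the extension's implementation: the render function, the regular expression and the use of XML escaping from the standard library are assumptions made for illustration, since the exact escaping rules are not described here. It shows the difference between a plain parameter, which is escaped, and a parameter wrapped in notencode, which is written as is.

```python
import re
from xml.sax.saxutils import escape

# A shortened variant of the template above; var2 uses notencode.
TEMPLATE = """<item>
<var1>{$var1}</var1>
<var2>{notencode($var2)}</var2>
</item>"""

# Group 1 captures the name in {$name}, group 2 the name in {notencode($name)}.
PARAM = re.compile(r"\{(?:\$(\w+)|notencode\(\$(\w+)\))\}")


def render(template, data):
    """Replace template parameters with values (hypothetical helper)."""
    def substitute(match):
        plain_name, raw_name = match.group(1), match.group(2)
        if raw_name is not None:
            # notencode: the value is inserted without escaping.
            return str(data.get(raw_name, ""))
        # Default: special symbols are escaped (XML rules assumed here).
        return escape(str(data.get(plain_name, "")))
    return PARAM.sub(substitute, template)


result = render(TEMPLATE, {"var1": "a < b", "var2": "<b>bold</b>"})
print(result)
```

With these sample values, var1 is written with the < sign escaped, while var2 keeps its markup unchanged.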
    

Export To Relational Database - RDB

RDB export is intended for storing the collected data in an existing database with a predefined schema, for example Opencart. The user is provided with dataset templates consisting of certain fields. Typically, the field names correspond to the field names in the administration panel of the CMS.

export-rdb1.jpg

For example, the Category dataset consists of the fields Category Name, Category Description, Language, etc. When you create an export profile, exactly one dataset is always specified.

Suppose we have product pages, each with its own attributes. They can be depicted schematically as follows:

export-rdb2.jpg

Then the dataset will look like this:

Product
    Language | Product Name | Product Price | Product Specification
    English | Product1 | 100 | (nested dataset below)
        Product Specification
            Attribute Group Name | Group Attributes
            Group1 | (nested dataset below)
                Group Attributes
                    Attribute1 | Value1
                    Attribute2 | Value2
    English | Product2 | 150 | (nested dataset below)
        Product Specification
            Attribute Group Name | Group Attributes
            Group1 | (nested dataset below)
                Group Attributes
                    Attribute1 | Value1
            Group2 | (nested dataset below)
                Group Attributes
                    Attribute3 | Value3
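
One way to picture this nesting is as a plain data structure. The Python sketch below is only an illustration: the key names are hypothetical and reflect neither the extension's internal format nor the Opencart schema. It mirrors the Product dataset above and walks the three levels of nesting.

```python
# Hypothetical in-memory picture of the Product dataset shown above.
products = [
    {
        "language": "English",
        "name": "Product1",
        "price": 100,
        "specification": [
            {"group": "Group1",
             "attributes": [("Attribute1", "Value1"), ("Attribute2", "Value2")]},
        ],
    },
    {
        "language": "English",
        "name": "Product2",
        "price": 150,
        "specification": [
            {"group": "Group1", "attributes": [("Attribute1", "Value1")]},
            {"group": "Group2", "attributes": [("Attribute3", "Value3")]},
        ],
    },
]


def flatten(items):
    """Yield one (product, group, attribute, value) tuple per attribute."""
    for product in items:
        for spec in product["specification"]:
            for attribute, value in spec["attributes"]:
                yield product["name"], spec["group"], attribute, value


rows = list(flatten(products))
for row in rows:
    print(row)
```

Each level of the structure corresponds to one dataset: the product itself, its specification groups, and the attributes inside each group.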

Let's create export profiles for the Opencart database with the datasets Product, Product Specification and Group Attributes. We will call the profiles oc-product, oc-productspec and oc-attr respectively. The latter two profiles are used to build data in memory, which is then nested into the oc-product data. Next, associate the dataset fields with the necessary scraping rules; you do not need to fill in every field. In the script, the export using the store function looks like this: