Web Scrapers Generator BrowserExt

gettext

Gets the text content of the nodes matching the given XPath expression. Available only inside loadpage.

[@auto] var = gettext(xpath[, options = dict()]);
    

Parameters:

xpath An XPath expression string defining the node address.
options A dictionary of parameters. It may contain the following elements:
nodeonly If this parameter is set to "true", gettext returns only the text nodes that belong directly to the requested node. Otherwise it returns the text made up of all text nodes of all nested elements. See Example 1.
next If this parameter is set to "true", gettext returns the text content of the node that follows the requested node. See Example 1.
word An integer giving the number of the word to return. Words are counted from one. To return several words, use this parameter together with wordend. See Example 1.
wordend An integer giving the number of the last word to return. The string that starts at position word and ends at position wordend is returned. See Example 1.
join If this parameter is set, gettext concatenates all elements of the result array into a single string, using the parameter value as the separator. The result is an array with one element, a string.
replace If this parameter is set, gettext replaces the text matched by the given regular expression with another string. The value must be an array of two elements: the first is the regular expression to search for, the second is the replacement string. Example: array('/\d*/', '') finds every run of consecutive digits and removes it.
html HTML code can be passed through this parameter; the function then looks for elements matching the xpath in this HTML code instead of in the loaded page.

Returned value: an array of strings.

The @auto directive is used when the script needs to be modified by means of the "built-in browser".
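The behaviour of the nodeonly, word, wordend, replace and join options described above can be approximated outside the tool. The following Python sketch is only an illustration and not the implementation of gettext: the function name gettext_sketch is hypothetical, it relies on the limited XPath subset of Python's xml.etree.ElementTree, and it assumes that whitespace is collapsed and that the options are applied in the order shown. The next option is omitted, and regular expressions are written without the surrounding slashes.

```python
import re
import xml.etree.ElementTree as ET

# Sample markup, well-formed enough to be parsed as XML.
PAGE = """<html>
<body>
<div id="123">
    text1
    <div>text2</div>
    <div>text3</div>
</div>
<div>345</div>
<div id="678">one two three four</div>
</body>
</html>"""


def gettext_sketch(xpath, options=None, html=PAGE):
    """Approximate the documented options; not the tool's own code."""
    options = options or {}
    root = ET.fromstring(html)
    results = []
    for node in root.findall(xpath):
        if options.get("nodeonly"):
            # Only the text nodes that are direct children of the node.
            parts = [node.text or ""] + [child.tail or "" for child in node]
        else:
            # Text of the node and of all nested elements.
            parts = list(node.itertext())
        # Assumption: runs of whitespace are collapsed to single spaces.
        text = " ".join(" ".join(parts).split())
        if "word" in options:
            words = text.split()
            first = options["word"]  # words are counted from one
            last = options.get("wordend", first)
            text = " ".join(words[first - 1:last])
        if "replace" in options:
            pattern, replacement = options["replace"]
            text = re.sub(pattern, replacement, text)
        results.append(text)
    if "join" in options:
        # The result stays a list, now holding a single joined string.
        results = [options["join"].join(results)]
    return results


print(gettext_sketch(".//div[@id='123']", {"nodeonly": True}))
print(gettext_sketch(".//div[@id='678']", {"word": 2, "wordend": 3}))
print(gettext_sketch("./body/div", {"join": "|"}))
```

Under these assumptions the three calls print ['text1'], ['two three'] and ['text1 text2 text3|345|one two three four']. The real function may normalise whitespace or order its options differently.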

Example 1. Consider an HTML page at the address http://test.com/test.html that contains the following code:

<html>
<body>
<div id="123">
    text1
    <div>text2</div>
    <div>text3</div>
</div>
<div>345</div>
<div id="678">one two three four</div>
</body>
</html>

Let's take a look at how different values of the options dictionary affect the result: