[M] UrlsDict, UrlsMask & UrlsCombine
General description
These modules search for URLs on target web sites. The search uses a template containing a mark symbol, which is replaced by each candidate name.
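A minimal sketch of the template idea, not WS's actual code. The function name `expand_template`, the example URL and the word list are made up for illustration. Replacing every occurrence of the mark symbol is also an assumption.

```python
from typing import Iterable, Iterator


def expand_template(template: str, names: Iterable[str], msymbol: str = "@") -> Iterator[str]:
    """Yield one URL per candidate name by substituting the mark symbol."""
    if msymbol not in template:
        raise ValueError(f"template has no mark symbol {msymbol!r}")
    for name in names:
        # Assumption: every occurrence of the mark symbol is replaced.
        yield template.replace(msymbol, name)


# Hypothetical target and candidate names.
urls = list(expand_template("http://example.com/@.php", ["admin", "login"]))
print(urls)  # ['http://example.com/admin.php', 'http://example.com/login.php']
```

Each generated URL is then requested and checked against the positive/negative response signatures.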
As the signature of a «Not Found» response (the equivalent of status code 404) you may use regexps (matched against the response headers and body), the response size, or other status codes. You may also supply a regexp that detects found pages (the equivalent of status code 200).
If you specify more than one signature for positive/negative responses, they are checked in the following priority order, highest first:
Regexp for detecting a positive response (status code 200 analogue)
Size of the negative response
Regexp for detecting a negative response (status code 404 analogue)
Status codes that mark negative responses
As soon as one of these signatures matches, the remaining checks are skipped. Example: you specify the regexp «found» as the positive response signature. WS requests item X and gets a response with status code 404 and the phrase «found» in the body. WS treats it as an existing element (as if the status code were 200), because the positive-response regexp has higher priority than the status code check.
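The priority order can be sketched as a chain of early returns. This is an illustration under assumptions, not WS's implementation. The function `item_exists` and its parameters are hypothetical names that mirror the options. The sketch also assumes that the size is measured on the body, that a signature which does not match falls through to the next check, and that 404 is the only negative code when none is given.

```python
import re
from typing import Optional, Sequence


def item_exists(status: int, headers: str, body: str, *,
                found_re: Optional[str] = None,
                not_found_size: Optional[int] = None,
                not_found_re: Optional[str] = None,
                not_found_codes: Sequence[int] = (404,)) -> bool:
    """Decide whether a response describes an existing item."""
    # Regexps are matched against headers and body together.
    text = headers + "\r\n\r\n" + body

    # 1. Positive regexp wins over everything else.
    if found_re is not None and re.search(found_re, text):
        return True
    # 2. Size of the negative response (assumption: body length).
    if not_found_size is not None and len(body) == not_found_size:
        return False
    # 3. Negative regexp.
    if not_found_re is not None and re.search(not_found_re, text):
        return False
    # 4. Status codes come last.
    return status not in not_found_codes


# The example from the text: status 404, but the body contains «found».
print(item_exists(404, "", "item found", found_re="found"))  # True
# Without the positive regexp the status code decides.
print(item_exists(404, "", "item found"))  # False
```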
The module can search for objects by mask, by dictionary, and by a combination of both (mask + dict).
The module works in «raw» and «selenium» modes.
config.ini contains the skip_listing option. If it is enabled (the default), WS will not work with a directory listing (that is, when the target URL is a directory index).
Examples
Simple search by dict:
Simple search by mask:
Search by dict in selenium mode:
Search on a site that returns status code 200 for not-found pages:
Search treating response code 500 as 404:
Search that retries the request if the response code is 503:
Search that retries the request if the response contains the phrase "Too big load":
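The retry conditions used in the last two examples (a status code such as 503, or a phrase such as "Too big load") can be sketched as follows. This only illustrates the decision and is not WS's code. `needs_retest`, `fetch_with_retest` and the `max_attempts` limit are hypothetical; whether WS limits the number of repeats is not stated here.

```python
import re
from typing import Callable, Optional, Sequence, Tuple

Response = Tuple[int, str]  # (status code, body)


def needs_retest(status: int, body: str,
                 retest_codes: Sequence[int] = (),
                 retest_re: Optional[str] = None) -> bool:
    """Return True if the request should be sent again."""
    if status in retest_codes:
        return True
    return retest_re is not None and re.search(retest_re, body) is not None


def fetch_with_retest(fetch: Callable[[], Response],
                      retest_codes: Sequence[int] = (),
                      retest_re: Optional[str] = None,
                      max_attempts: int = 5) -> Response:
    """Call fetch() until the response no longer asks for a retest."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for _ in range(max_attempts):
        response = fetch()
        if not needs_retest(response[0], response[1], retest_codes, retest_re):
            break
    return response  # last response, even if it still matched


# Simulated server: overloaded twice, then answers normally.
replies = iter([(503, "Service Temporarily Unavailable"),
                (200, "Too big load"),
                (200, "ok")])
result = fetch_with_retest(lambda: next(replies),
                           retest_codes=(503,), retest_re="Too big load")
print(result)  # (200, 'ok')
```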
Options (* = required)
| Name | Default | R (raw) | S (selenium) | Description |
|---|---|---|---|---|
| --template * | | Yes | Yes | Search template containing the mark symbol (see --msymbol). |
| --dict * | | Yes | Yes | For UrlsDict. Path to the dictionary. |
| --mask * | | Yes | Yes | For UrlsMask. Symbol mask. |
| --combine-template * | | Yes | Yes | For UrlsCombine. Template for combined work: a string with the markers «%m%» and «%d%», which are replaced by the mask word and the dict word respectively. |
| --found-re | | Yes | Yes | RegEx (python.re) for detecting a positive web-server response (code 200 analogue). |
| --not-found-re | | Yes | Yes | RegEx (python.re) for detecting a negative web-server response (code 404 analogue). |
| --not-found-size | | Yes | Yes | Size of the negative response (code 404 analogue). Note that this size can differ between tools; use test mode to get the right value. |
| --not-found-codes | | Yes | No | Status codes treated as analogues of 404, separated by commas. |
| --method | GET | Yes | No | HTTP method to use: HEAD, POST or GET. |
| --proxies | | Yes | Yes | HTTP proxy list. |
| --retest-re | | Yes | Yes | RegEx (python.re) for detecting that a request must be repeated, for example «Service Temporarily Unavailable». |
| --retest-codes | | Yes | No | Set of status codes (separated by commas) that trigger a request re-send. |
| --headers-file | | Yes | No | File with HTTP headers to add to the requests. |
| --ignore-words-re | | Yes | Yes | RegEx (python.re) for ignoring target phrases. Useful when you do not want to check some phrases, for example those containing “.ht”. |
| --msymbol | @ | Yes | Yes | Mark symbol for the search template (--template). |
| --delay | 0 | Yes | Yes | Delay in seconds between requests. The delay applies to each thread separately, not to all threads together. |
| --threads | 10 | Yes | Yes | Number of worker threads. |
| --parts | 0 | Yes | Yes | Split the target dict or mask into X parts. |
| --part | 0 | Yes | Yes | Number of the part to work with. |
| --test | 0 | Yes | Yes | Enable test mode. |
| --xml-report | 0 | Yes | Yes | Path for saving the XML report. |
| --selenium | 0 | No | Yes | Enable selenium mode. |
| --browser-recreate-re | | No | Yes | RegEx (python.re) for detecting that the browser must be recreated. If you use proxies, the browser selects a new one. |
| --browser-wait-re | | No | Yes | RegEx (python.re). While it matches, the browser stops working and waits for the match to disappear. Use it to solve a captcha by hand or to wait out an anti-DDoS check («wait 5 secs, we check your browser»). |
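How --ignore-words-re, --parts and --part could act on a word list is sketched below. This is an illustration under assumptions rather than WS's real behaviour. `prepare_words` is a hypothetical helper, the parts are assumed to be contiguous chunks, part numbers are assumed to start at 1, 0 is assumed to mean "do not split", and filtering is assumed to happen before splitting.

```python
import re
from typing import List, Optional


def prepare_words(words: List[str],
                  ignore_words_re: Optional[str] = None,
                  parts: int = 0, part: int = 0) -> List[str]:
    """Filter a word list and optionally keep only one part of it."""
    # Drop words matching the ignore regexp (r"\.ht" skips ".htaccess").
    if ignore_words_re:
        words = [w for w in words if not re.search(ignore_words_re, w)]
    # Assumption: parts == 0 disables splitting; part is 1-based.
    if parts > 0:
        if not 1 <= part <= parts:
            raise ValueError("part must be between 1 and parts")
        size = -(-len(words) // parts)  # ceiling division
        words = words[(part - 1) * size:part * size]
    return words


# Made-up dictionary content.
words = ["admin", ".htaccess", "backup", "login", ".htpasswd", "old"]
print(prepare_words(words, ignore_words_re=r"\.ht"))
# ['admin', 'backup', 'login', 'old']
print(prepare_words(words, ignore_words_re=r"\.ht", parts=2, part=2))
# ['login', 'old']
```

Splitting this way would let several runs, each with the same --parts value and a different --part, cover one dictionary without overlap.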