    Tracking Processors: Client Classification Processor

    The client classification processor is designed to detect popular, legitimate search engine bots. These bots are known for aggressive spidering of web sites, and this activity can often trigger security-related incidents. Using this processor to define the conditions that identify such bots allows the system to ignore security incidents from those clients. This removes search-engine-related false positives and prevents errors in indexed and cached results. The popular search engines are included by default; if additional search engines should be allowed, new rules can be created. Be careful not to define a rule that will match clients other than the targeted search engine bot. The less specific the conditions of a rule, the easier it is for an attacker to spoof the search engine and circumvent detection. DNS must be enabled on WebApp Secure for search engines to be classified effectively. Leaving this processor turned on without enabling DNS may result in some attackers not being identified.

    If a client is classified as a search engine based on one of the defined rules, that client will not be able to generate incidents. In addition:

    • Query String Processor will be turned off for that client (no query parameter injections)
    • Hidden Link Processor will be turned off for that client (no hidden link injections)

    This ensures that the results cached by the search engine bot do not include injected fake code that may change in the future, which would end up flagging clients who follow legitimate search engine links.

    Classification rules are made up of a series of patterns to run against various attributes of the client:

    • IP Address
    • Hostname
    • User Agent
    • Country Code
    • City
    • Region
    • Header Name and Value

    At least one pattern must be specified on at least one attribute; however, you can specify patterns for as many attributes as the bot will allow. For example, if the bot changes its IP address constantly, you should not define a pattern for the IP. However, if the hostname always ends in google.com, a pattern of [.]google[.]com$ could be assigned to the “Hostname” attribute. If the user agent always contains “googlebot”, then “googlebot” could be assigned as the user agent pattern. Here is an example of a complete pattern for the Googlebot search engine spider:

    Hostname Pattern: [.]google(bot)?[.]com$
    User Agent Pattern: (adsbot.google|googlebot|Google[ ]Web[ ]Preview|Mediapartners-Google)
    Country Pattern: US
    Region Pattern: (California|Georgia)
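
The way such a rule could be evaluated can be sketched in Python. This is only an illustration, not WebApp Secure's implementation: it assumes that patterns are regular expressions searched anywhere in the attribute value, that matching is case-insensitive, and that every specified pattern must match. The `ClassificationRule` class, its `matches` method, and the client dictionary keys are hypothetical names, and the header name and value patterns are left out for brevity.

```python
import re
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass
class ClassificationRule:
    """Hypothetical stand-in for one classification rule."""
    client_type: str
    ip: Optional[str] = None
    hostname: Optional[str] = None
    user_agent: Optional[str] = None
    country: Optional[str] = None
    city: Optional[str] = None
    region: Optional[str] = None

    def matches(self, client: Dict[str, Optional[str]]) -> bool:
        # Collect only the patterns that were actually specified.
        patterns = {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if f.name != "client_type" and getattr(self, f.name) is not None
        }
        # A rule needs at least one pattern on at least one attribute.
        if not patterns:
            return False
        # Assumption: every specified pattern is required (logical AND).
        for attribute, pattern in patterns.items():
            value = client.get(attribute)
            # A missing value (e.g. no hostname because DNS is off) fails.
            if value is None or re.search(pattern, value, re.IGNORECASE) is None:
                return False
        return True


# The Googlebot rule from the example above.
GOOGLEBOT = ClassificationRule(
    client_type="Googlebot",
    hostname=r"[.]google(bot)?[.]com$",
    user_agent=r"(adsbot.google|googlebot|Google[ ]Web[ ]Preview|Mediapartners-Google)",
    country=r"US",
    region=r"(California|Georgia)",
)

genuine = {
    "hostname": "crawl-66-249-66-1.googlebot.com",
    "user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "country": "US",
    "region": "California",
}
# Same user agent, but the hostname only *contains* the bot's domain.
spoofed = dict(genuine, hostname="googlebot.com.attacker.example")

print(GOOGLEBOT.matches(genuine))
print(GOOGLEBOT.matches(spoofed))
```

The `$` anchor in the hostname pattern is what rejects the spoofed client: the hostname has to end in the search engine's domain, not merely contain it. A rule with only a user agent pattern would accept both clients, which is why less specific rules are easier to spoof.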

    Note: It would be extremely difficult for an attacker to spoof values for all of those attributes so that they match the patterns. For example, spoofing the reverse DNS lookup to end in “.google.com” would require serious effort, as well as an insecure DNS configuration on the part of the WebApp Secure administrator. Ideally, every rule should include either an “ip” or “hostname” pattern.
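
As a general illustration of why a hostname obtained by reverse DNS is hard to fake, the following Python sketch performs a forward-confirmed reverse DNS check: it resolves the IP to a hostname, then resolves that hostname back and accepts it only if the original IP is among the results. This is a widely used technique, not something the text above states that WebApp Secure does internally. The function names are hypothetical, and `socket.gethostbyname_ex` handles IPv4 only.

```python
import socket
from typing import Callable, List, Optional


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the PTR hostname for an IP, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname: str) -> List[str]:
    """Return the IPv4 addresses for a hostname, or [] if the lookup fails."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def verified_hostname(
    ip: str,
    reverse: Callable[[str], Optional[str]] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> Optional[str]:
    """Return the hostname only if it resolves back to the same IP."""
    hostname = reverse(ip)
    if hostname is None:
        return None
    # An attacker who controls the PTR record for their own IP range can
    # claim any hostname, but cannot make that name resolve back to them.
    return hostname if ip in forward(hostname) else None
```

The resolvers are passed in as parameters so the check can be exercised with canned data instead of live DNS. The hostname it returns is what a hostname pattern would then be matched against.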

    Table 1: Client Classification Configuration Parameters

    | Parameter            | Type    | Default Value | Description                                                                                   |
    |----------------------|---------|---------------|-----------------------------------------------------------------------------------------------|
    | Basic                |         |               |                                                                                               |
    | Processor Enabled    | Boolean | False         | Whether traffic should be passed through this processor.                                      |
    | Classification Rules |         |               |                                                                                               |
    | Client Type          | String  | (none)        | The name of the type of client being identified.                                              |
    | IP Pattern           | String  | (none)        | The IP address pattern to require (if any).                                                   |
    | Hostname Pattern     | String  | (none)        | The hostname pattern to require (if any) if DNS is enabled.                                   |
    | User Agent Pattern   | String  | (none)        | The user agent pattern to require (if any).                                                   |
    | Country Pattern      | String  | (none)        | The country pattern to require (if any).                                                      |
    | City Pattern         | String  | (none)        | The city pattern to require (if any).                                                         |
    | Region Pattern       | String  | (none)        | The region pattern to require (if any).                                                       |
    | Header Name Pattern  | String  | (none)        | A pattern used to identify a required header name (if any).                                   |
    | Header Value Pattern | String  | (none)        | A pattern used to verify the value of a header that matches the header name pattern (if any). |

    Published: 2013-11-20