One of the improvements in Snort 3 is Enhanced JavaScript Normalizer which has its
own module and can be used with any service inspectors where JavaScript code might occur.
Currently it is supported for the following inspectors: HTTP, SMTP, IMAP, POP.

==== Overview

You can configure it by adding:

    js_norm = {}

to your snort.lua configuration file. Or you can read about it in the
source code under src/js_norm.

Having 'js_norm' module configured and ips option 'js_data' in the rules automatically
enables Enhanced Normalizer.

The Enhanced Normalizer can normalize JavaScript embedded in HTML (inline scripts),
in separate .js files (external scripts), and JavaScript embedded in PDF files sent over HTTP/1,
HTTP/2, SMTP, IMAP and POP3 protocols. It supports scripts over multiple PDUs. It is a stateful
JavaScript whitespace and identifiers normalizer. Normalizer concatenates string literals whenever
it's possible to do. This also works with any other normalizations that result in string literals.
All JavaScript identifier names, except those from the ignore lists, will be substituted with unified
names in the following format: var_0000 -> var_ffff. The Normalizer tries to expand escaped text, so
it will appear in a readable form in the output. When such text is a parameter of an unescape function,
the entire function call will be replaced by the unescaped string. Moreover, Normalizer validates the
syntax concerning ECMA-262 Standard, including scope tracking and restrictions for script elements.
JavaScript, embedded in PDF files, has to be decompressed before normalization. For that,
decompress_pdf = true option has to be set in configuration of appropriate service inspectors.

Check with the following options for more configurations: bytes_depth, identifier_depth,
max_tmpl_nest, max_bracket_depth, max_scope_depth, ident_ignore, prop_ignore.

Enhanced normalizer is the preferred option for writing new JavaScript related rules, though
legacy normalizer (part of http_inspect) is still available to support old rules.

==== Configuration

Configuration can be as simple as adding:

    js_norm = {}

to your snort.lua file. The default configuration provides a thorough
normalization and may be all that you need, but there are some options that
provide extra features, tweak how things are done, or conserve resources by
doing less.

Also, there are default lists of ignored identifiers and object properties provided.
To get a complete default configuration, use 'default_js_norm' from $SNORT_LUA_PATH/snort_defaults.lua
by adding:

    js_norm = default_js_norm

to your snort.lua file.

Enhanced JavaScript Normalizer implements JIT approach. Actual normalization takes place
only when js_data option is evaluated. This option is also used as a buffer selector for
normalized JavaScript data.

===== bytes_depth

bytes_depth = N {-1 : max53} will set a number of input JavaScript
bytes to normalize. When the depth is reached, normalization will be stopped.
It's implemented per-script. By default bytes_depth = -1, will set
unlimited depth.

===== identifier_depth

identifier_depth = N {0 : 65536} will set a number of unique
JavaScript identifiers to normalize. When the depth is reached, a built-in
alert is generated. Every response has its own identifier substitution context,
which means that identifier will retain same normal form in multiple scripts,
if they are a part of the same response, and that this limit is set for a single
response and not a single script. By default, the value is set to 65536, which
is the max allowed number of unique identifiers. The generated names are in
the range from var_0000 to var_ffff.

===== max_tmpl_nest

max_tmpl_nest = N {0 : 255} (default 32) is an option of the enhanced
JavaScript normalizer that determines the deepest level of nested template literals
to be processed. Introduced in ES6, template literals provide syntax to define
a literal multiline string, which can have arbitrary JavaScript substitutions,
that will be evaluated and inserted into the string. Such substitutions can be
nested, and require keeping track of every layer for proper normalization. This option
is present to limit the amount of memory dedicated to template nesting tracking.

===== max_bracket_depth

max_bracket_depth = N {1 : 65535} (default 256) is an option of the enhanced
JavaScript normalizer that determines the maximum depth of nesting brackets, i.e. parentheses,
braces and square brackets, nested within a matching pair, in any combination. This option
is present to limit the amount of memory dedicated to bracket tracking.

===== max_scope_depth

max_scope_depth = N {1 : 65535} (default 256) is an option of the enhanced
JavaScript normalizer that determines the deepest level of nested variable scope,
i.e. functions, code blocks, etc. including the global scope.
This option is present to limit the amount of memory dedicated to scope tracking.

===== ident_ignore

ident_ignore = {<list of ignored identifiers>} is an option of the enhanced
JavaScript normalizer that defines a list of identifiers to keep intact.

Identifiers in this list will not be put into normal form (var_0000). Subsequent accessors,
after dot, in square brackets or after function call, will not be normalized as well.

For example:

    console.log("bar")
    document.getElementById("id").text
    eval("script")
    console["log"]

Every entry has to be a simple identifier, i.e. not include dots, brackets, etc.
For example:

    js_norm.ident_ignore = { 'console', 'document', 'eval', 'foo' }

When a variable assignment that 'aliases' an identifier from the list is found,
the assignment will be tracked, and subsequent occurrences of the variable will be
replaced with the stored value. This substitution will follow JavaScript variable scope 
limits.

For example:

    var a = console.log
    a("hello") // will be substituted to 'console.log("hello")'

For class names and constructors in the list, when the class is used with the
keyword 'new', created object will be tracked, and its properties will be kept intact.
Identifier of the object itself, however, will be brought to unified form.

For example:

    var o = new Array() // normalized to 'var var_0000=new Array()'
    o.push(10) // normalized to 'var_0000.push(10)'

The default list of ignore-identifiers is present in "snort_defaults.lua".

Unescape function names should remain intact in the output. They ought to be
included in the ignore list. If for some reason the user wants to disable unescape
related features, then removing function's name from the ignore list does the trick.

===== prop_ignore

prop_ignore = {<list of ignored properties>} is an option of the enhanced
JavaScript normalizer that defines a list of object properties and methods that
will be kept intact during normalization of identifiers. This list should include
methods and properties of objects that will not be tracked by assignment substitution
functionality, for example, those that can be created implicitly.

Subsequent accessors, after dot, in square brackets or after function call, will not be
normalized as well.

For example:

    js_norm.prop_ignore = { 'split' }

    in: "string".toUpperCase().split("").reverse().join("");
    out: "string".var_0000().split("").reverse().join("");

The default list of ignored properties is present in "snort_defaults.lua".

==== Detection rules

Enhanced JavaScript Normalizer follows JIT approach, which requires rules with
'js_data' IPS option to be executed. This can lead to missed data when js_data
option is not evaluated for some packets, e.g. if there is a non-js_data fast
pattern. In this case, when fast pattern doesn't match, JavaScript normalization
is skipped for the current PDU. If later js_data IPS rule matches again,
a missed normalization context is detected and 154:8 built-in alert is raised.
Further normalization is not possible for the script.
For example:

    alert http (msg:"JS in HTTP"; js_data; content:"var var_0000"; sid:1;)
    alert smtp (msg:"JS in SMTP"; js_data; content:"var var_0000"; sid:2;)

===== js_data

The js_data IPS contains normalized JavaScript text collected from the whole PDU.
It requires the Enhanced JavaScript Normalizer configured.

==== Trace messages

When a user needs help to sort out things going on inside Enhanced JavaScript Normalizer,
Trace module becomes handy.

    $ snort --help-module trace | grep js_norm

Messages for the enhanced JavaScript Normalizer follow
(more verbosity available in debug build):

===== trace.module.js_norm.proc

Messages from script processing flow and their verbosity levels:

1. Script opening tag location.

2. Attributes of the detected script.

3. Return codes from Normalizer.

===== trace.module.js_norm.dump

JavaScript data dump and verbosity levels:

1. js_data buffer as it is passed to detection.

2. (no messages available currently)

3. Current script as it is passed to Normalizer.

