Extracting event logs from a cable modem

As part of monitoring my home network, I wanted to capture the event logs from my Arris SB6190 cable modem, so I could watch for any recurring problems. The modem has an “Event Log” page, so capturing the data is just a matter of fetching the HTML page, converting it to XHTML, and using XML tools to grab and reformat the information.

Here is a Python program that uses the requests and tidylib libraries. These should be installed with pip if they are not already. HTML Tidy should also be available.

The program also uses xml.etree.ElementTree, which should be preinstalled, to process the XML. I find this package much more cumbersome to work with than XQuery, but being able to issue HTTP requests and call HTML Tidy from within the program worth the effort.

The program is called like this:

python modem.py subdir clear

Where subdir is the name of the subdirectory where the XML output file will be written, and clear is an optional keyword that indicates that the event log should be cleared after it is downloaded. (The input source does not need to be specified, because it is always the same: http://192.168.100.1/cgi-bin/eventlog).

The SB6190 modem does not have a battery-backed clock, so the first couple of messages after a reboot have a timestamp of “Thu Jan 01 00:00:00 1970” (all zeros in Unix time notation). This is a problem if you want to sort the messages by date and time, because the zero-timestamp messages will all sort to the front.

This program fixes the timestamps by taking the timestamp of first subsequent message with a valid timestamp, and using its timestamp minus one second. Since it is working backwards from the last message, it can adjust the timestamps for a run of zero-timestamp messages.

An example of the output is shown at the top of this article.

This same program, with a few modifications that are listed in the comments, can also be used to extract event logs from ARRIS SB6121 cable modems.

Note: I have made a few changes from the version of the program that I originally posted, to accommodate running it periodically with a cron job:

  • The first parameter is now the name of the subdirectory where the output file should be written.
  • The output file is given a name with the time and date, like this:
    sb6190-log-yyyy-mm-dd-hh-mm-ss.xml.
  • In order not to pollute the directory with files consisting of just the XML header and <events />, the output file will be written only when there are one or more events in the log.
  • The error messages from the first HTML Tidy step are no longer printed. We know they are there, and we don’t care, and if we print them, they get sent in a mail to root, cluttering up its mailbox. By the same token, the message acknowledging that the clear parameter was specified has been removed.
  • If the clear parameter is specified, the clear is not actually done unless there are events to be cleared. I would hope that the modem is smart enough to not actually do anything if there are no events to clear, but on the off chance that it writes something anyway, this will save a little wear and tear on the internal SSD memory.

Here is the program:

##
# This program retrieves the event log from an Arris SB6190
# cable modem, and converts it to XML. Optionally, it
# can clear the log immediately after retrieving it.
#
# The first parameter to this program is the
# name of the subdirectory for the output file.
# The file will be named
# sb6190-log-yyyy-mm-dd-hh-mm-ss.xml.
#
# The second parameter, which is optional, is
# "clear" (without the quotation marks) if you
# want the program to clear the event log
# in the cable modem.
#
#
# With a few modifications, it can also be used with the
# Arris SB6121 cable modem:
#
# - The URL for retrieving the log is
#   http://192.168.100.1/cmLogsData.htm
# - The format of the Time column is different so,
#   for example, instead of "Thu Dec 28 14:47:20 2023",
#   it would have "Dec 28 2023 14:47:20".
#   week, so instead of
# - The Priority column includes the character version
#   of the priority so, for example, instead of "6",
#   it would say "6-Notice".
# - The is an additional column, "Code", between the
#   Priority and Message (Description) columns.
#
#
# License: Public Domain
#
# To the extent possible under law, Lynn Grant has waived
# all copyright and related or neighboring rights to
# this program. This work is published from: United States.
#


import datetime
import xml.etree.ElementTree as ET
import requests
import sys
from tidylib import tidy_document

tidyOptions1 = {"indent": 1, "output-xhtml": 1}
tidyOptions2 = {"input-xml": 1, "indent": 1}

headings = ["time", "priority", "description"]

url = "http://192.168.100.1/cgi-bin/eventlog"

clearLog  = {"SnmpClearEventLog": "1"}

dateElements = []
timeElements = []
tdStamps = []

oneSec = datetime.timedelta(seconds=1)

noStamp = datetime.datetime(year=1970, month=1, day=1, hour=0, minute=0, second=0)

##
# Convert a date of the form "Thu Dec 28 14:47:04 2023" to an datetime instance and
# save it. Also generate empty <date> and <time> tags, and save their handles.
# In postprocessing, we will use the saved datetime instances to fill in the
# saved <date> and <time> tags.
#
# @param event - The ET handle for the parent <event> element.
# @param date  - The date, in the form "Thu Dec 28 14:47:04 2023"
# @returns - Zero is always returned.
#

def convDate(event, date):
   splitDate = date.split(" ")  # 0 - day of week, 1 - month, 2 - day of month,
                             # 3 - time, 4 - year.
   splitTime = splitDate[3].split(":")  # 0 - hour, 1 - minute, 2 - second

   C_MONTH = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
   N_MONTH = [ "01",  "02",  "03",  "04",  "05",  "06",  "07",  "08",  "09",  "10",  "11",  "12"]

   year = int(splitDate[4])

   month = "00"
   for j in range(0, len(C_MONTH)):
      if (splitDate[1] == C_MONTH[j]):
          month = N_MONTH[j]
   month = int(month)

   day = int(splitDate[2])

   hour = int(splitTime[0])

   minute = int(splitTime[1])

   second = int(splitTime[2])

   tdStamp = datetime.datetime(year=year, month=month, day=day, hour=hour, minute=minute, second=second)
   tdStamps.append(tdStamp)

   date = ET.SubElement(event, "date")
   dateElements.append(date)

   time = ET.SubElement(event, "time")
   timeElements.append(time)

   return 0


##
# Convert a numberic priority into an extended priority with the name of the priority.
#
# This routine generates a <priority> element with the numeric priority, and a
# <extprioriy> with the name of the priority appended, like "3-Critical".
#
# @param event - The ET handle for the parent <event> element.
# @param priority - The numeric priority. Type is string.
# @returns - Zero is always returned.
#

def convPrio(event, priority):

    # The priority levels are defined in RFC 4639, "DOCSIS Cable Device MIB",
    # and also section 8.1.2.3, "Standard DOCSIS Events for CMs", of the
    # DOCSIS 3.1 Cable Modem Operations Support System Interface Specification,
    # CM-SP-CM-OSSIv3.1-I25-231012.
    #
    # In the definitions below, "CM" stands for "Cable Modem", and "CMTS"
    # stands for "Cable Modem Termination System", such as a head-end
    # router.

    C_PRIORITY = ["",              # Filler entry, since priority levels
                                   # start 1, not zero.

                  "Emergency",     # 1 - Reserved for vendor-specific 'fatal'
                                   #     hardware or software errors that prevent

                                   #     normal system operation and cause the
                                   #     reporting system to reboot.
                                   #
                                   #     Every vendor may define their own set
                                   #     of emergency events. Examples of such
                                   #     events might be 'no memory buffers
                                   #     available', 'memory test failure', etc.

                  "Alert",         # 2 - A serious failure, which causes the
                                   #     reporting system to reboot, but it is
                                   #     not caused by hardware or software
                                   #     malfunctioning.

                  "Critical",      # 3 - Examples of such events might be
                                   #     configuration file problems detected
                                   #     by the modem or inability to get an
                                   #     IP address from the DHCP server.

                  "Error",         # 4 - A failure ocurred that could
                                   #     interrupt the normal data flow, but
                                   #     will not cause the modem to re-register.
                                   #     Error events could be reported in real
                                   #     time by using the trap or syslog mechanism.

                  "Warning",       # 5 - A failure occurred that could interrupt
                                   #     normal data flow, but will not cause the
                                   #     modem to re-register. 'Warning' level is
                                   #     assigned to events that both CM and CMTS
                                   #     have information about. To prevent sending
                                   #     the same event, both from the CM and the
                                   #     CMTS, the trap and syslog reporting
                                   #     mechanism is disabled by default for the
                                   #     CM for this level.

                  "Notice",        # 6 - The event is important, but not a failure
                                   #     and could be reported in real time by
                                   #     using the trap or syslog mechanism.
                                   #     For a CM, an example of a Notice event is
                                   #     any event from 'SW UPGRADE SUCCESS'
                                   #     group.

                  "Informational", # 7 - The event is of marginal importance, and
                                   #     is not a failure, but could be helpful
                                   #     for tracing the normal modem operation.

                  "Debug"]         # 8 - Reserved for vendor-specific non-critical
                                   #     events.

    iPriority = int(priority)

    if (iPriority >= 1 and iPriority <= 8):
        charPriority = C_PRIORITY[iPriority]
    else:
        charPriority = ""

    prio = ET.SubElement(event, "priority")
    prio.text = priority

    extPrio = ET.SubElement(event, "extpriority")
    extPrio.text = priority + "-" + charPriority

    return 0


##
# Process the description field.
#
# Here is an example of a description field:
#
#   Lost MDD Timeout;CM-MAC=8c:5c:27:d7:f4:7d;CMTS-MAC=00:01:5d:4b:22:34;CM-QOS=1.1;CM-VER=3.0;
#
#   The main description is followed by several fields, separated by semicolons, that give
#   additional information, such as the MAC address of the cable modem and the head-end router,
#   the quality-of-service level, and the DOCSIS version.
#
#   A <description> element is generated for the main description. For the the additional fields,
#   the part before the equal sign is used as the element name, such as <CM-MAC>, and the part
#   after is used as its data.
#
#   @param event - The ET handle for the parent <event> element.
#   @param desc  - The description field to be processed.
#   @returns - Zero is always returned.
#

def convDesc(event, desc):

    # The description looks like this: Lost MDD Timeout;CM-MAC=8c:5c:27:d7:f4:7d;CMTS-MAC=00:01:5d:4b:22:34;CM-QOS=1.1;CM-VER=3.0;
    splitDesc = desc.split(";")

    for j in range(0, len(splitDesc)):
        if (j == 0):
            descElement = ET.SubElement(event, "description")
            descElement.text = splitDesc[0]
        elif (splitDesc[j] != ""):
            resplit = splitDesc[j].split("=")
            entry = ET.SubElement(event, resplit[0])
            entry.text = resplit[1]

    return 0


# Field conversion routines:
PROCESS = [convDate,  # Date
           convPrio,  # Priority
           convDesc]  # description

##### Mainline #####

# Get the parameters.
subDir = sys.argv[1]
if (len(sys.argv) > 2 and sys.argv[2] == "clear"):
    clearLogRequested = True
else:
    clearLogRequested = False

# Get the current time and date
now = datetime.datetime.today()
nowString = now.isoformat(sep="-",timespec="seconds")
nowString = nowString.replace(":", "-")

# Generate the output file name
outFile = subDir + "/sb6190-log-" + nowString + ".xml"

# Retrieve the event log as HTML from the cable modem.
res = requests.get(url)
rawHtml = res.text

# Convert it to XHTML which, of course, is also XML.
xhtml, errors = tidy_document(rawHtml, tidyOptions1)

# Print the error messages. There will always be a few,
# but they are important. We print them just to monitor
# things, in case there is ever a real problem.
# print ("Input HTML errors:\n", errors)

# Parse the XHTML string to make an XML tree.
root = ET.fromstring(xhtml)

# Generate the root of the new XML tree we are generating.
newRoot = ET.Element('events')

# Find the event table. There will be only one table in the document.
for table in root.iter('{http://www.w3.org/1999/xhtml}table'):

   for tr in table.iter('{http://www.w3.org/1999/xhtml}tr'):

       # The last row of the table has not text; it just contains
       # the "Clear Log" button. We do not create an <event>
       # element for this row until we are sure it is a message
       # row.
       event = ""
       col = 0
       for td in tr.iter('{http://www.w3.org/1999/xhtml}td'):
          text = str(td.text).strip()
          if (text != "" and text != "CM Events List Empty"):
             # This is a message row. If we have not created
             # the <event> element for this row, do so now.
             if (event == ""):
                event = ET.SubElement(newRoot, "event")

             # Look up the appropriate function to process
             # this column and process it.
             func = PROCESS[col]
             if (func != ""):
                text = func(event, text)
                col = col + 1

# Fill in the values for the <date> and <time> elements, in reverse order.
#
# The SB6190 and SB6121 cable modems do not have a battery-backed TOD clock.
# So the first two or three messages after a reboot have a zero timestamp,
# which translates (in Unix terms) to "Thu Jan 01 00:00:00 1970".
#
# If you sort the messages by date, all the zero-timestamp messages sort to the
# front, making it difficult to correlate them with the surrounding messages.
# So here is what we do: we run through the list of timestamps from last to first.
# When we find a zero timestamp, we take the timestamp of the next message, subtract
# one second, and use if for the timestamp. Since we are processing in reverse order,
# if there are several zero-timestamp messages, each one get the timestamp of the
# previous one, minus one second.
#
# There are two edge conditions we have to keep in mind:
# - This scheme will fail if the last message in the event log has a zero timestamp.
#   This seems unlikely, as there is always at least one message with a good timestamp
#   following the zero-timestamp ones, and it is probably not possible to download the
#   event log before it is issued, since the interface is still coming up after a
#   reboot.
# - What if there are so many zero-timestamp messages that we end up with the first
#   on having an earlier timestamp than the good timestamp preceding it? This seems
#   unlikely, since there are generally only one or two messages with zero timestamps
#   generated before the clock is available, and it takes more than two seconds for
#   a reboot to occur.

j = len(tdStamps) - 1
while (j >= 0):

    # If this is a zero timestamp, take the next one and back it down by one second.
    if (tdStamps[j] == noStamp):
        tdStamps[j] = tdStamps[j+1] - oneSec

    # Fill in the <date> and <time> elements.
    tdString = str(tdStamps[j]).split(" ")  # 0 - date: yyyy-mm-dd, 1 - time: hh:mm:ss
    dateElements[j].text = tdString[0]
    timeElements[j].text = tdString[1]
    j = j - 1

# Run the generated XML through HTML tidy to pretty-print it.
xml_data = '<?xml version="1.0" encoding="UTF-8"?>' + ET.tostring(newRoot).decode('UTF-8')
tidy_xml, errors = tidy_document(xml_data, tidyOptions2)

# If there are errors, display them.
if (errors != ""):
   print ("Output XML errors\n", errors)

# If we had any events, write out the new XML
if (len(tdStamps) > 0):

   # Clear the event log, if requested to do so.
   # Note: We do this only if we had some events.
   # This saves wear and tear on the SSD memory
   # in the cable modem.
   if (clearLogRequested):
      clearResponse = str(requests.post(url, clearLog))
      if (clearResponse != "<Response [200]>"):
         print ("Unexpected response to Clear Log\n", clearResponse)

   # Write out the new XML
   of = open(outFile, "wt")
   of.write(tidy_xml)
   of.close

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *