Metrics
StatsD Metrics in Socorro
Socorro uses StatsD with DogStatsD extensions.
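As background on what the DogStatsD extensions add, here is a minimal sketch of how one metric is laid out as a datagram line, with tags appended after the value and type. The helper name format_datagram, the mapping from this page's type names to protocol type codes, and the tag value example_job are illustrative assumptions, not Socorro code.

```python
# Hypothetical helper illustrating the DogStatsD line format.
# Assumed mapping from the type names used on this page to protocol codes.
TYPE_CODES = {"incr": "c", "gauge": "g", "timing": "ms", "histogram": "h"}


def format_datagram(key, value, metric_type, tags=None):
    """Return one DogStatsD line: key:value|type, plus |#tags if any."""
    line = f"{key}:{value}|{TYPE_CODES[metric_type]}"
    if tags:
        # The DogStatsD extension appends comma-separated tags after '|#'.
        line += "|#" + ",".join(tags)
    return line


print(format_datagram("socorro.processor.cache_manager.evict", 1, "incr"))
print(
    format_datagram(
        "socorro.cron.job_run",
        1250,
        "timing",
        ["job:example_job", "result:success"],  # made-up tag values
    )
)
```

Plain StatsD stops after the type code; the tag suffix is what lets a single key such as socorro.cron.job_run be broken down by job and result.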
Table of metrics:
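Key | Type |
---|---|
socorro.cron.job_run | timing |
socorro.cron.verifyprocessed.missing_processed | gauge |
socorro.processor.betaversionrule.cache | incr |
socorro.processor.betaversionrule.lookup | incr |
socorro.processor.cache_manager.evict | incr |
socorro.processor.cache_manager.q_overflow | incr |
socorro.processor.cache_manager.usage | gauge |
socorro.processor.cache_manager.file_sizes.avg | gauge |
socorro.processor.cache_manager.file_sizes.median | gauge |
socorro.processor.cache_manager.file_sizes.ninety_five | gauge |
socorro.processor.cache_manager.file_sizes.max | gauge |
socorro.processor.cache_manager.files.count | gauge |
socorro.processor.cache_manager.files.gt_500 | gauge |
socorro.processor.denonerule.had_nones | incr |
socorro.processor.denullrule.has_nulls | incr |
socorro.processor.dest1.save_processed_crash | timing |
socorro.processor.es.crash_document_size | histogram |
socorro.processor.es.index | histogram |
socorro.processor.es.indexerror | incr |
socorro.processor.es.save_processed_crash | timing |
socorro.processor.ingestion_timing | timing |
socorro.processor.minidumpstackwalk.run | incr |
socorro.processor.process_crash | timing |
socorro.processor.rule.act.timing | timing |
socorro.processor.save_processed_crash | incr |
socorro.processor.storage.save_processed_crash | timing |
socorro.processor.telemetry.save_processed_crash | timing |
socorro.processor.truncatestackrule.stack_size | gauge |
socorro.processor.truncatestackrule.truncated | incr |
socorro.sentry_scrub_error | incr |
socorro.submitter.accept | incr |
socorro.submitter.ignore | incr |
socorro.submitter.process | timing |
socorro.submitter.unknown_finished_func_error | incr |
socorro.submitter.unknown_process_error | incr |
socorro.submitter.unknown_submit_error | incr |
socorro.webapp.crashstats.models.cache_set_error | incr |
socorro.webapp.view.pageview | timing |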
Metrics details:
- socorro.cron.job_run
Type:
timing
How long the cron job took to run.
Tags:
job: short string identifying the job that ran
result: success or failure
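To illustrate how a timing with job and result tags might be produced, here is a minimal sketch using a context manager. The names timed_job and emitted and the job names are hypothetical, and the list stands in for a real StatsD client; this is not the way Socorro necessarily implements it.

```python
import time
from contextlib import contextmanager

emitted = []  # stand-in for a StatsD client: (key, milliseconds, tags)


@contextmanager
def timed_job(job_name):
    """Time the wrapped block and record it with job and result tags."""
    start = time.perf_counter()
    result = "success"
    try:
        yield
    except Exception:
        # Mark the run as failed, then let the error propagate.
        result = "failure"
        raise
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        tags = [f"job:{job_name}", f"result:{result}"]
        emitted.append(("socorro.cron.job_run", elapsed_ms, tags))


with timed_job("example_job"):
    time.sleep(0.01)

try:
    with timed_job("broken_job"):
        raise RuntimeError("boom")
except RuntimeError:
    pass

for key, elapsed_ms, tags in emitted:
    print(key, round(elapsed_ms), tags)
```

Recording in the finally clause means a duration is emitted for failed runs too, so the result tag can separate slow successes from slow failures.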
- socorro.cron.verifyprocessed.missing_processed
Type:
gauge
Gauge of crash reports for which there was no processed crash file.
- socorro.processor.betaversionrule.cache
Type:
incr
Counter for whether the BetaVersionRule pulled version information from cache or not.
Tags:
result: hit or miss
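A counter tagged with hit or miss can be sketched as follows. The cache dictionary, the cached_lookup function, and the example key and value are all hypothetical stand-ins; the real rule's cache and lookup logic are not described on this page.

```python
from collections import Counter

counts = Counter()  # stand-in for a StatsD client, keyed by tag
cache = {}


def cached_lookup(key, fetch):
    """Return the cached value for key, calling fetch(key) on a miss."""
    if key in cache:
        counts["result:hit"] += 1
        return cache[key]
    counts["result:miss"] += 1
    cache[key] = fetch(key)
    return cache[key]


# The first call misses and fills the cache; the second call hits.
cached_lookup("example-key", lambda key: "example-version")
cached_lookup("example-key", lambda key: "example-version")
print(dict(counts))
```

The ratio of hits to total increments gives the cache hit rate over any time window.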
- socorro.processor.betaversionrule.lookup
Type:
incr
Counter for whether the BetaVersionRule did a lookup using the Crash Stats VersionString API.
Tags:
result: success or fail
- socorro.processor.cache_manager.evict
Type:
incr
Counter for file evictions.
- socorro.processor.cache_manager.q_overflow
Type:
incr
Counter for inotify Q_OVERFLOW events in cache manager.
- socorro.processor.cache_manager.usage
Type:
gauge
Gauge for total size of cache. In bytes.
- socorro.processor.cache_manager.file_sizes.avg
Type:
gauge
Gauge for the average file size for files in the cache. In bytes.
- socorro.processor.cache_manager.file_sizes.median
Type:
gauge
Gauge for the median file size for files in the cache. In bytes.
- socorro.processor.cache_manager.file_sizes.ninety_five
Type:
gauge
Gauge for the 95th percentile file size for files in the cache. In bytes.
- socorro.processor.cache_manager.file_sizes.max
Type:
gauge
Gauge for max file size in cache. In bytes.
- socorro.processor.cache_manager.files.count
Type:
gauge
Total number of files in the cache.
- socorro.processor.cache_manager.files.gt_500
Type:
gauge
Total number of files in the cache larger than 500 MB.
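The cache manager gauges above are all summary statistics over the sizes of the cached files. A minimal sketch of computing them from a list of sizes follows. The function name file_size_gauges, the nearest-rank percentile method, and reading 500 MB as 500,000,000 bytes are assumptions; the actual cache manager may define these differently.

```python
import math
import statistics

# Assumption: the gt_500 threshold is 500,000,000 bytes.
THRESHOLD_BYTES = 500 * 1000 * 1000


def file_size_gauges(sizes):
    """Map gauge key suffixes to values for a non-empty list of byte sizes."""
    ordered = sorted(sizes)
    # Nearest-rank 95th percentile; other definitions interpolate instead.
    rank = math.ceil(0.95 * len(ordered))
    return {
        "usage": sum(ordered),
        "file_sizes.avg": statistics.mean(ordered),
        "file_sizes.median": statistics.median(ordered),
        "file_sizes.ninety_five": ordered[rank - 1],
        "file_sizes.max": ordered[-1],
        "files.count": len(ordered),
        "files.gt_500": sum(1 for size in ordered if size > THRESHOLD_BYTES),
    }


gauges = file_size_gauges([100, 200, 300, 400, 600_000_000])
for suffix, value in gauges.items():
    print(f"socorro.processor.cache_manager.{suffix}", value)
```

With one very large file in the sample, the average is far above the median, which is why both are worth reporting alongside the 95th percentile and the maximum.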
- socorro.processor.denonerule.had_nones
Type:
incr
Counter for how many crash annotation values were None. All crash annotation values should be strings, so None isn’t valid and usually comes from a bug in the crash reporter.
- socorro.processor.denullrule.has_nulls
Type:
incr
Counter for how many nulls were in keys and values for crash annotations.
- socorro.processor.dest1.save_processed_crash
Type:
timing
Used in tests.
- socorro.processor.es.crash_document_size
Type:
histogram
Size of crash document. In bytes.
- socorro.processor.es.index
Type:
histogram
Total time it took to index the crash document in Elasticsearch.
- socorro.processor.es.indexerror
Type:
incr
Counter for errors when indexing a document in Elasticsearch.
Tags:
error: the error code indicating what happened
- socorro.processor.es.save_processed_crash
Type:
timing
Timer for how long it takes to save the processed crash to Elasticsearch.
- socorro.processor.ingestion_timing
Type:
timing
Timer for how long it took to ingest a crash report, measured from the submitted timestamp until processing completed.
The start time is the submitted_timestamp from the collector.
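The ingestion time is a difference between two timestamps. The sketch below assumes that submitted_timestamp is an ISO 8601 string with a UTC offset and that the timing is reported in milliseconds; the function name ingestion_ms and the sample values are hypothetical.

```python
from datetime import datetime, timezone


def ingestion_ms(submitted_timestamp, completed_at):
    """Milliseconds from submission to the end of processing."""
    # Assumption: the collector's timestamp is ISO 8601 with an offset.
    submitted = datetime.fromisoformat(submitted_timestamp)
    return (completed_at - submitted).total_seconds() * 1000


completed = datetime(2024, 1, 1, 12, 0, 5, tzinfo=timezone.utc)
print(ingestion_ms("2024-01-01T12:00:00+00:00", completed))  # 5000.0
```

Because the start time comes from the collector, this value includes time the crash report spent waiting in the queue, not only time spent in the processor.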
- socorro.processor.minidumpstackwalk.run
Type:
incr
Counter for minidump stackwalk executions.
Tags:
outcome: either success or fail
exitcode: the exit code of the minidump stackwalk process
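As a rough illustration of deriving the outcome and exitcode tags from a finished subprocess, consider the sketch below. It assumes that an exit code of 0 means success, which this page does not state, and it runs the Python interpreter as a stand-in for the stackwalker binary. The function name run_and_tag is hypothetical.

```python
import subprocess
import sys


def run_and_tag(command):
    """Run a command and return outcome and exitcode tags for it."""
    completed = subprocess.run(command, capture_output=True)
    # Assumption: exit code 0 is the only successful outcome.
    outcome = "success" if completed.returncode == 0 else "fail"
    return [f"outcome:{outcome}", f"exitcode:{completed.returncode}"]


# The Python interpreter stands in for the stackwalker here.
print(run_and_tag([sys.executable, "-c", "pass"]))
print(run_and_tag([sys.executable, "-c", "raise SystemExit(3)"]))
```

Tagging the raw exit code alongside the outcome makes it possible to tell different kinds of failure apart without reading logs.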
- socorro.processor.process_crash
Type:
timing
Timer for how long it takes to process a crash report.
Tags:
ruleset: the ruleset used for processing
- socorro.processor.rule.act.timing
Type:
timing
Timer for how long it takes for the rule to run.
Tags:
rule: rule class name
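Per-rule timing tagged with the class name could look something like the following sketch. ExampleRule, run_rules, and the timings list are hypothetical; the real processor's rule interface and the exact form of the tag value may differ.

```python
import time

timings = []  # stand-in for a StatsD client: (key, milliseconds, tags)


class ExampleRule:
    """Hypothetical rule that marks the crash as seen."""

    def act(self, crash):
        crash["seen"] = True


def run_rules(rules, crash):
    """Run each rule in order and record how long each one took."""
    for rule in rules:
        start = time.perf_counter()
        rule.act(crash)
        elapsed_ms = (time.perf_counter() - start) * 1000
        # Assumption: the tag value is the class name as written.
        tags = [f"rule:{type(rule).__name__}"]
        timings.append(("socorro.processor.rule.act.timing", elapsed_ms, tags))


crash = {}
run_rules([ExampleRule()], crash)
print(timings[0][0], timings[0][2])
```

Sharing one key across all rules and distinguishing them by tag keeps the metric list short while still allowing a per-rule breakdown.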
- socorro.processor.save_processed_crash
Type:
incr
Counter for number of crash reports successfully processed and saved to storage.
- socorro.processor.storage.save_processed_crash
Type:
timing
Timer for how long it takes to save the processed crash to the storage bucket.
- socorro.processor.telemetry.save_processed_crash
Type:
timing
Timer for how long it takes to save the processed crash to the Telemetry storage bucket.
- socorro.processor.truncatestackrule.stack_size
Type:
gauge
Gauge for stack sizes.
- socorro.processor.truncatestackrule.truncated
Type:
incr
Counter for stacks that were truncated because they were too large.
- socorro.sentry_scrub_error
Type:
incr
Emitted when there are errors scrubbing Sentry events. Monitor these because they mean we’re missing Sentry event data.
Tags:
service: webapp, submitter, processor, or cache_manager
- socorro.submitter.accept
Type:
incr
Counter for how many destinations the crash report was resubmitted to.
- socorro.submitter.ignore
Type:
incr
Counter for how many destinations were ignored for resubmitting the crash report.
- socorro.submitter.process
Type:
timing
Timer for how long it takes to process a crash report, which involves figuring out where the crash report should be sent, downloading the data, creating the payload, and submitting it.
- socorro.submitter.unknown_finished_func_error
Type:
incr
Counter for how many unknown finished func errors were encountered.
- socorro.submitter.unknown_process_error
Type:
incr
Counter for how many unknown process errors were encountered.
- socorro.submitter.unknown_submit_error
Type:
incr
Counter for how many unknown submit errors were encountered.
- socorro.webapp.crashstats.models.cache_set_error
Type:
incr
Counter for errors when caching middleware model request results.
- socorro.webapp.view.pageview
Type:
timing
Timer for how long it takes to handle an HTTP request.
Tags:
ajax: whether or not the request was an AJAX request
api: whether or not the request was an API request (path starts with /api/)
path: the path of the request
status: the HTTP response code
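The four pageview tags can all be derived from the request and response. The sketch below shows one way to build them. The function name pageview_tags, the use of the X-Requested-With header to detect AJAX requests, the lowercase true/false tag values, and the example paths are assumptions rather than the webapp's confirmed behavior.

```python
def pageview_tags(path, status_code, headers):
    """Build the tag list for one handled HTTP request."""
    # Assumption: AJAX requests carry the X-Requested-With header.
    is_ajax = headers.get("X-Requested-With") == "XMLHttpRequest"
    # From the tag description: API requests have paths under /api/.
    is_api = path.startswith("/api/")
    return [
        f"ajax:{str(is_ajax).lower()}",
        f"api:{str(is_api).lower()}",
        f"path:{path}",
        f"status:{status_code}",
    ]


print(pageview_tags("/api/example/", 200, {}))
print(pageview_tags("/example/", 404, {"X-Requested-With": "XMLHttpRequest"}))
```

Note that a path tag has as many distinct values as there are distinct request paths, so it can produce a large number of tag combinations.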