Our application calculates some stats and writes them to its logs, and we wanted to graph them with our already running Graphite service. There are two ways to approach this. The first is to send statistics directly with a Graphite library for your language of choice; in Python, for example, it is very simple to set up and send metrics via graphite send. But that requires adding extra code to your business application just to send metrics to Graphite. My strong personal opinion is that your application should be responsible for business logic, not necessarily for sending stats to Graphite.
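To make the trade-off concrete, here is a minimal sketch of what the direct approach adds to an application. It uses only the standard library and Graphite's plaintext protocol (one "path value timestamp" line per metric) rather than any particular client library. The function names and the metric path are hypothetical; the host and port are the ones used in the logstash config later in this post.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one newline-terminated line of Graphite's plaintext protocol."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path, value, host="10.11.12.13", port=2003, timestamp=None):
    """Open a TCP connection to Carbon and send a single metric line."""
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
    return line


# Hypothetical call from inside the business application:
# send_metric("your_app.overall.total_requests", 120)
```

Even in this small form, the application now has to know about the Graphite host, handle connection failures, and decide when to send, which is exactly the coupling the logstash approach avoids.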
Enter logstash! The second way is to have logstash read the statistics from your logs and send them to graphite.
Here's the config broken down into smaller chunks:
Read the log file with multiline codec:
input {
  stdin { }
  file {
    path => "/var/log/your_app/your_app.log"
    start_position => beginning
    # unless a line starts with a digit, append it to the previous line
    codec => multiline {
      'negate' => true
      'pattern' => '^\d'
      'what' => 'previous'
    }
  }
}
Here we are setting up logstash to read from a file: /var/log/your_app/your_app.log. The codec block configures the input with the multiline codec, which appends the current line to the previous one unless the current line starts with a digit. The effect is that a stack trace is condensed into a single event instead of being split across many. Also note that we start reading from the beginning of the log file: start_position => beginning.
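If the negate/previous combination is hard to picture, the following Python sketch mimics the grouping rule outside of logstash. It is only an illustration of the behaviour described above, not the codec's implementation; merge_multiline and the sample log lines are made up for the example.

```python
import re


def merge_multiline(lines, pattern=r"^\d"):
    """Group lines the way negate => true, what => previous would:
    a line that does NOT match the pattern is appended to the previous event."""
    starts_event = re.compile(pattern)
    events = []
    for line in lines:
        if starts_event.search(line) or not events:
            events.append(line)          # a new event begins here
        else:
            events[-1] += "\n" + line    # continuation, e.g. a stack trace line
    return events


sample = [
    "2015-03-01 10:00:00 ERROR Something failed",
    "Traceback (most recent call last):",
    '  File "app.py", line 10, in <module>',
    "2015-03-01 10:00:01 INFO Recovered",
]
events = merge_multiline(sample)
for event in events:
    print(repr(event))
```

The four input lines come out as two events: the error together with its traceback, and the following INFO line on its own.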
Parse each log line to look for certain patterns:
filter {
  grok {
    match => [ "message", "Number of requests:%{SPACE}%{NUMBER:TotalReq},%{SPACE}Successes:%{SPACE}%{NUMBER:TotalSuccess},%{SPACE}Errors:%{SPACE}%{NUMBER:TotalErrors}.%{GREEDYDATA:LogMessage}" ]
    add_tag => [ "API_Stats", "Regular_Logs" ]
  }
  grok {
    match => [ "message", "API errors:%{SPACE}
Total Errors: %{NUMBER:API_TotalErrors}
\[400: %{NUMBER:API_400_Errors},
401: %{NUMBER:API_401_Errors},
404: %{NUMBER:API_404_Errors},
4xx: %{NUMBER:API_4xx_Errors},
500: %{NUMBER:API_500_Errors},
5xx: %{NUMBER:API_5xx_Errors},
others: %{NUMBER:API_Others_Errors}\]" ]
    add_tag => [ "API_Error_Stats", "Regular_Logs" ]
  }
}
In the filter block, we match each line against two grok patterns. Let's look at the first one. If a log line matches the pattern, it is tokenized to capture TotalReq, TotalSuccess, TotalErrors and LogMessage, and the tags API_Stats and Regular_Logs are added. Similarly, if a line matches the second grok pattern, it gets the tags API_Error_Stats and Regular_Logs and is tokenized into:
- API_TotalErrors
- API_400_Errors
- API_401_Errors
- API_404_Errors
- API_4xx_Errors
- API_500_Errors
- API_5xx_Errors
- API_Others_Errors
The most important thing to remember is that the actual values are stored in the tokens mentioned above, which means we can send these tokens to graphite to plot the values. That's what follows next.
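To see what the tokenizing amounts to, here is a rough Python equivalent of the first grok pattern written as a plain regular expression. The NUMBER expression is a simplified stand-in for grok's own definition, and parse_stats and the sample log line are invented for the illustration, so treat it as a sketch rather than an exact translation.

```python
import re

# Simplified stand-in for grok's NUMBER; %{SPACE} becomes \s* and
# %{GREEDYDATA} becomes .*
NUMBER = r"[+-]?\d+(?:\.\d+)?"

STATS_RE = re.compile(
    rf"Number of requests:\s*(?P<TotalReq>{NUMBER}),"
    rf"\s*Successes:\s*(?P<TotalSuccess>{NUMBER}),"
    rf"\s*Errors:\s*(?P<TotalErrors>{NUMBER})"
    r".(?P<LogMessage>.*)"
)


def parse_stats(message):
    """Return the captured tokens as a dict of strings, or None if no match."""
    match = STATS_RE.search(message)  # unanchored, like a grok match
    return match.groupdict() if match else None


# Hypothetical log line in the shape the pattern expects
line = ("2015-03-01 10:00:00 INFO Number of requests: 120, "
        "Successes: 117, Errors: 3. Stats window closed")
fields = parse_stats(line)
print(fields)
```

The captured values are strings here, just as grok captures are strings unless a type is requested, and each one is addressed by the token name from the pattern.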
Send the tokens to graphite:
output {
  if "API_Stats" in [tags] {
    stdout { codec => rubydebug }
    graphite {
      host => "10.11.12.13"
      port => 2003
      metrics => [
        "environments/staging/servers/stga-API/overall/total_requests", "%{TotalReq}",
        "environments/staging/servers/stga-API/overall/total_success", "%{TotalSuccess}",
        "environments/staging/servers/stga-API/overall/total_errors", "%{TotalErrors}"
      ]
    }
  } else if "API_Error_Stats" in [tags] {
    stdout { codec => rubydebug }
    graphite {
      host => "10.11.12.13"
      port => 2003
      metrics => [
        "environments/staging/servers/stga-API/API/total_errors", "%{API_TotalErrors}",
        "environments/staging/servers/stga-API/API/total_400_errors", "%{API_400_Errors}",
        "environments/staging/servers/stga-API/API/total_401_errors", "%{API_401_Errors}",
        "environments/staging/servers/stga-API/API/total_404_errors", "%{API_404_Errors}",
        "environments/staging/servers/stga-API/API/total_4xx_errors", "%{API_4xx_Errors}",
        "environments/staging/servers/stga-API/API/total_500_errors", "%{API_500_Errors}",
        "environments/staging/servers/stga-API/API/total_5xx_errors", "%{API_5xx_Errors}",
        "environments/staging/servers/stga-API/API/total_others_errors", "%{API_Others_Errors}"
      ]
    }
  }
}
With the graphite output it's pretty straightforward to send stats to graphite. All you need to provide are the host, port and metrics settings. metrics is basically a list of pairs: the first element of each pair is the path in graphite where the stat will be visible, and the second is the value. For example, "%{API_TotalErrors}" is the value of total errors in the API and will be sent to: environments/staging/servers/stga-API/API/total_errors.
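As a way to reason about the path/value pairs, the sketch below fills the "%{...}" placeholders from an event's tokens and prints the resulting "path value timestamp" lines. It approximates the idea only and is not the plugin's code; render_metrics, the sample event and the fixed timestamp are assumptions made for the example.

```python
import re

PLACEHOLDER = re.compile(r"%\{(\w+)\}")


def render_metrics(metrics, event, timestamp):
    """Turn {path: "%{Token}"} pairs into plaintext metric lines.
    In this sketch, a placeholder with no matching token is left unchanged."""
    lines = []
    for path, template in metrics.items():
        value = PLACEHOLDER.sub(
            lambda m: str(event.get(m.group(1), m.group(0))), template
        )
        lines.append(f"{path} {value} {timestamp}")
    return lines


metrics = {
    "environments/staging/servers/stga-API/API/total_errors": "%{API_TotalErrors}",
    "environments/staging/servers/stga-API/API/total_404_errors": "%{API_404_Errors}",
}
event = {"API_TotalErrors": "7", "API_404_Errors": "2"}  # hypothetical tokens

for metric_line in render_metrics(metrics, event, 1700000000):
    print(metric_line)
```

Running the stdout { codec => rubydebug } output next to the graphite output, as the config does, is a handy way to check that the tokens you expect are actually present on each event.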
For the sake of completeness, here's the full configuration with the important parts highlighted.
<code data-gist-id="48d9a423efa7e098d05f" data-gist-highlight-line="4-5,16,20,32-34,43-46"></code>
That's it! Your stats are now seamlessly available in graphite. Have you tried graphite with logstash? What's your experience? Did you use a different configuration?