Symbolic resources for everyday praxis.

HTTP Archive (har) to CSV converter (har2csv)

This is a small command-line application written in Java for converting HAR (HTTP Archive) files to CSV (comma-separated values). It’s useful for importing web page statistics from a browser extension into a spreadsheet, in order to analyze sizes and timings.

Like all Java applications, it’s cross-platform. All you need is a properly installed JRE (Java Runtime Environment). I added two executable wrappers: one for Windows systems (har2csv.exe) and a Bash script for Linux boxes (har2csv).

Usage is pretty straightforward:

har2csv --in <source> --out <destination>

Or type “har2csv --help” to get more info.

These are the binaries (if you intend to use it right away):

hartools_20140311_2332.tar.gz

And here’s the source code, in case you want to study or modify the application:

hartools_20140311_2332_source.tar.gz

Binaries and source are also available from GitHub, at this URL:

https://github.com/spcgh0st/HarTools

NOTE: I haven’t mapped all fields from the HAR specification to columns in the CSV file, because I only needed a few. If you need any extra data exported, feel free to contact me and I’ll gladly update the code.

Update 2014/03/04@2251: added “startedDateTime” and “time”, from the Entry object, to the CSV output.

Update 2014/03/11@0045: added “Referer”, from the Request > Headers array, to the CSV output.

Update 2014/03/11@2332: fixed a bug that made the CSV conversion ignore floating-point values in the HAR file (e.g., in timings).


34 thoughts on “HTTP Archive (har) to CSV converter (har2csv)”

  1. stevecao says:

    Hi,

    Thanks for this excellent tool.

    However, when I was trying to convert my HAR file, I got the following feedback:
    [ERROR] Malformed HAR file.

    I didn’t edit the file manually and it was generated by IE developer’s tool.
    Can I add some characters to the head of the file to resolve this problem?

    Thanks.

  2. Flavio says:

    Hello, I downloaded the file and extracted it, but when I try to open har2csv.exe, it opens a command prompt and immediately closes it.
    I’m very new to all of this, is there something I’m doing wrong? Can you help me?
    Thanks in advance!

    1. Marco says:

      You need to run it through command line.
      Open your terminal/command line and navigate to the folder where the program is (on disk). Then call it:
      har2csv --in <source> --out <destination>

      Is this self explanatory?
      M

  3. Keegan says:

    Are you aware that the output of this tool (“har2csv”) is actually a tab delimited file and not a CSV? Might be worth mentioning that somewhere at the top otherwise people might think you’re crazy. Cheers

  4. Sophinette says:

    Excuse me, I don’t understand how to use this tool? I’m on Windows 10 and when I click on the .exe file, a black MS-DOS window opens and closes immediately. Nothing happens if I ask Windows to open the .har file with the .exe file, I’m lost.

    Thanks in advance for your help.

    Best regards,
    Sophie.

  5. samon says:

    This tool doesn’t work at all in Windows 10. Any suggestions?

    1. Yamamoto says:

      Hi Samon, is it throwing any error that you could post here? Thank you!

  6. RAJASHEKAR S says:

    Hi,

    Very good tool. Could you please extend it to also save the total number of requests, the total load time, and the onload time (displayed at the bottom of the .har file) to the CSV file?

    Thanks

    Rajashekar S

  7. Spencer says:

    Hi,

    When I run har2csv.exe I get the error:

    Exception in thread “main” java.lang.NoClassDefFoundError: org/apache/wink/json4j/JSONException

  8. Yamamoto says:

    Henk, it seems to be working fine here with the out-of-the-box debugger in Firefox 44.0.2.

  9. Henk says:

    Doesn’t work, at least not on the latest Firefox. It doesn’t do anything other than convert a list of URLs.

  10. Julio C says:

    Great job. Very useful.

  11. -KaaL- says:

    Can you please include the field called ‘ip’ that is beside the URL? Thanks, I appreciate this tool.

  12. Yamamoto says:

    Hi Cris, Ant’s build.xml compiles the sources and creates the Windows executable, but still needs a few fixes because it doesn’t put all things together.

    The Linux executable is a Bash script that is placed in the “resources” folder. Once you’ve compiled the sources, you should move it to the root directory, along with the JAR that resulted from compilation.

    In GitHub, this is the path to the binaries. There you can check out the expected directory structure.

    https://github.com/spcgh0st/HarTools/releases

  13. chris hammond says:

    Where is the Linux version? I did not see it on GitHub.

  14. Andrew says:

    Hola,

    your tool is the most useful one I’ve found so far for converting HAR to CSV.
    Great job !

  15. Peter says:

    Hallo,

    Very useful, exactly what I was looking for.

    What are the license T&C?
    Can this be used for commercial purposes too?

    Many thanks,
    Peter

    1. Yamamoto says:

      Hello Peter. Yes, you can use the application for commercial purposes. It’s released under a BSD license (except for the JSON4J library, released under the Apache License 2.0).

      Thanks for your comment!

  16. Kothai says:

    Thanks for your time and tips. I will try to implement these metrics.

  17. Yamamoto says:

    Kothai :) First off, I apologize for the delay in my reply.

    Unfortunately, I can’t help you with that request. You need global, whole-page values, while I’m parsing individual entries within a page. The intention of this application is to help dissect a page and analyze how many resources each part of it consumes, not to compare whole pages with each other.

    That said, the values you’re asking for are either single identifiable elements in the input file, or can be obtained by applying calculations to the entries exported to CSV (and you can do that using a spreadsheet).

    1. OnLoad time: a single discrete value in the HAR file. It’s a property of the Page object (in JavaScript syntax: pages[0].pageTimings.onLoad).
    2. Total time taken to load the page: the time elapsed between the startedDateTime of the HTML page (the first exported entry) and the highest startedDateTime + time (that is, the ending time of the resource that finished last).
    3. Size from cache: the sum of the Response Content Size of all the elements whose Response Status is 304.

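
    The three calculations above can be sketched in a few lines. This is only an illustration, written in Python rather than Java, and it reads the HAR file directly instead of the CSV that har2csv produces. The function names are made up for this example; the field names (pages, pageTimings.onLoad, startedDateTime, time, response.status, response.content.size) come from the HAR format. It assumes timestamps have three or six fractional digits, which older Python versions require.

```python
import json
from datetime import datetime, timedelta


def parse_har_time(value):
    """Parse an ISO 8601 timestamp such as 2014-03-11T23:32:00.123Z."""
    # Python versions before 3.11 reject a trailing "Z", so normalize it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def page_metrics(har):
    """Compute whole-page values from a parsed HAR document (a dict)."""
    log = har["log"]
    entries = log["entries"]
    pages = log.get("pages") or []

    # 1. onLoad: a single value on the Page object, in milliseconds.
    on_load = pages[0].get("pageTimings", {}).get("onLoad") if pages else None

    # 2. Total load time: from the start of the first entry (the HTML page)
    #    to the end of the entry that finished last.
    starts = [parse_har_time(e["startedDateTime"]) for e in entries]
    ends = [s + timedelta(milliseconds=e["time"]) for s, e in zip(starts, entries)]
    total_ms = (max(ends) - starts[0]) / timedelta(milliseconds=1)

    # 3. Size from cache: content size of every 304 response.
    #    A size of -1 means "unknown" in HAR, so it is counted as 0 here.
    cached = sum(
        max(e["response"]["content"].get("size", 0), 0)
        for e in entries
        if e["response"]["status"] == 304
    )
    return {"onLoad": on_load, "totalLoadMs": total_ms, "sizeFromCache": cached}
```

    To try it on a real file, something like page_metrics(json.load(open("page.har", encoding="utf-8"))) should work, where page.har is a placeholder name.
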
  18. Kothai says:

    Hi Yamamoto,

    I need the onload time and the total time taken to be exported into the csv. Please help on this.

  19. Sachin says:

    Hi Yamamoto,

    Thank you for the pointer. I will try to implement that.

    Sachin

  20. Yamamoto says:

    Hi Sachin, thanks for your comment. You can iterate all HAR files in a directory and append the results to a single file, this way:

    Linux:

    for i in *.har; do har2csv -i "$i" >> output.csv; done

    Windows:
    (according to this post, I haven’t tested it)

    for /r %i in (*.har) do har2csv -i "%i" >> output.csv

    NOTE: you will have to remove the extra heading rows from the final CSV file, because every run writes its own header.
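
    Removing those extra heading rows by hand gets tedious with many files. Here is a minimal sketch of one way to do it, assuming every heading row is identical to the first line of the merged file. It is written in Python, is not part of har2csv, and the function and file names are made up for this example.

```python
def drop_repeated_headers(lines):
    """Yield lines, keeping the first one as the header and skipping
    any later line that is identical to it."""
    header = None
    for line in lines:
        row = line.rstrip("\r\n")  # compare without line endings
        if header is None:
            header = row
            yield line
        elif row != header:
            yield line


def clean_file(src, dst):
    """Copy src to dst without the repeated header rows.
    UTF-8 is an assumption; adjust the encoding if your files differ."""
    with open(src, newline="", encoding="utf-8") as fin, \
            open(dst, "w", newline="", encoding="utf-8") as fout:
        fout.writelines(drop_repeated_headers(fin))
```

    For example, clean_file("output.csv", "cleaned.csv") would write a copy of output.csv that has a single heading row.
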

  21. Sachin says:

    Hi,

    Very good tool. Could you please extend it to parse all the HAR files in a single directory and append the data to a single Excel file?

    Thank You!

  22. Yamamoto says:

    Yes, I’ve noticed the same: the “time” column seems to be the sum of all timings (for exporting, though, I just take that value from the HAR file; I’m not doing any arithmetic).

    With SSL (as with all numerical values), if the value is zero or -1 I leave it blank in the CSV file. However, I’ve just found a bug in the code: it expects timings to be integers, but they can also be floating-point numbers, and it exports as blank those that are not integers. I’ll fix it later today.

    Update: bug fixed.
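
    The rule described above can be sketched as follows. This is an illustration in Python, not the actual Java code of har2csv, and the function name is made up: zero and -1 become a blank cell, and both integers and floating-point numbers are accepted.

```python
def format_numeric(value):
    """Return the CSV cell text for a numeric HAR value."""
    # Anything that is not a number (including a missing value) is blank.
    # bool is excluded explicitly because it is a subclass of int.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return ""
    # Zero and -1 ("not applicable" in HAR timings) are left blank.
    if value in (0, -1):
        return ""
    # Integers and floating-point values are both exported as-is.
    return str(value)
```

    With this rule, a timing of 12.5 is exported as 12.5 instead of being dropped, which is the behavior the fix was meant to restore.
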

  23. l1ammy says:

    Just checked it, all the time columns seem to add up to time column

    I assumed the Time SSL column is left as null unless there is an actual SSL connection request, and not filled otherwise

    thanks again for the handy tool

  24. jean says:

    Thanks, got it. Just a stupid mistake in unzipping put all files to the root folder. Now working fine. Many thanks again!

  25. Yamamoto says:

    Hi Jean, that error means that the app is not finding a required library that is included in the “lib” folder, in the archive I uploaded. Try unpacking the whole content of the archive before running har2csv.

  26. jean says:

    Hi! Thanks for the very rapid reaction! However, I have trouble running it and get the error:

    “Exception in thread “main” java.lang.NoClassDefFoundError: org/apache/wink/json4j/JSONException”

    Sorry if this is only due to the user, I’m not that tech-savvy…

  27. Yamamoto says:

    Hi there :) I uploaded a new version that exports the Referer value. I couldn’t test it out thoroughly, but I think it’s working fine (otherwise, let me know, please). Thanks for your comment!

  28. jean says:

    Very useful tool. Any chance you could add the Referer field to the CSV output as well?

  29. l1ammy says:

    Many thanks for that.

    Tested it out and is working well.

  30. Yamamoto says:

    Hi :) I’ve just added “startedDateTime” and “time” to the CSV output, and updated the links. If you need anything else, let me know.

    Thanks for the comment!

  31. l1ammy says:

    Love the tool
    Would you be willing to include “startedDateTime”?
    I am using firebug to export the HAR, and would like to link it up to the absolute date and time and tcp.request.full_uri in wireshark

    There are some slight time differences, but a bit of rounding would fix that, and it is unlikely to get two identical HTTP requests within less than a tenth of a second.
