ulimit

I had downloaded an open source tool, pylint. Whenever I ran it, it died with a stack trace ending in "Too many open files". After searching the web for clues, I was fairly sure the tool was creating too many file descriptors, since I found reports of a similar problem. To be sure, I ran:

$ watch lsof +D .

while running the tool, and sure enough the output quickly filled up the screen.

With the help of my coworkers, I discovered that there is a setting to increase the maximum number of open files allowed for a process. The command is

$ ulimit

It's a bash builtin for setting resource limits on the shell and the processes it starts. Another thing that was new to me is the distinction between a soft limit and a hard limit. I'm not entirely sure why both exist, but the soft limit seems to be what was preventing pylint from working on a huge number of files: it is the limit that actually gets enforced, and a user can only raise it up to the hard limit.

Below are ways to find out the hard and soft limits:

$ ulimit -a -H
$ ulimit -a -S
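
Since pylint is a Python tool, the same pair of limits can also be read (and the soft one raised) from inside a Python process with the standard resource module. A minimal sketch, just to illustrate the soft/hard relationship (the target value here is only an example, not the number I actually needed):

import resource

# Current soft and hard limits on open file descriptors,
# the same numbers "ulimit -S -n" and "ulimit -H -n" report.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft:", soft, "hard:", hard)

# The soft limit can be raised from within the process,
# but never above the hard limit.
target = 4096  # example value
if hard == resource.RLIM_INFINITY or target <= hard:
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))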

Say the soft limit for open files is 1024: you cannot simply do $ ulimit -n 1025 if that exceeds what the system allows, because the soft limit can only be raised up to the hard limit (which on OS X is governed by launchd's maxfiles setting). That was a problem for me since I needed a lot more, so I first had to raise the system-level limit and then use ulimit to raise it for my session.

After more research, I found a solution.

Basically, since I’m on Mac Leopard I need to do:

$ launchctl limit maxfiles 10240 unlimited

or set it directly in:

/etc/launchd.conf

and then:

$ ulimit -n 10000

That solved my problem. Aside from that, I should probably ask the pylint developers why it needs to open so many files, since it could be that I am using it wrong, but at least I discovered something new.

Some investigation into result logs from Firefox 3.5 tests

Firefox 3.5 unit tests:

http://tinderbox.mozilla.org/showbuilds.cgi?tree=Firefox3.5-Unittest

Log types: EverythingElse (reftest, crashtest, mochitest-chrome, xpcshell, mochitest-a11y, browser-chrome), Mochitests (plain)

OSes: Linux, OS X 10.5.2, WINNT 5.2

Everything else

Reftest:

Multiple test units per file

TEST-PASS numbers don't add up (possibly the number provided by tinderbox is the number of files rather than test units)

Crashtest:

TEST-PASS numbers don’t add up

Tests appear once

One test unit per file

REFTEST TEST-PASS | file:///builds/slave/mozilla-1.9.1-linux-unittest-everythingelse/build/reftest/tests/gfx/src/mac/crashtests/306902-1.xml | (LOAD ONLY)

Mochitest-chrome:

Multiple test units per file

TEST-PASS & TEST-KNOWN-FAIL numbers add up

INFO TEST-KNOWN-FAIL | chrome://mochikit/content/chrome/toolkit/content/tests/chrome/test_preferences.xul | instant pref init file

INFO TEST-PASS | chrome://mochikit/content/chrome/toolkit/content/tests/chrome/test_preferences.xul | instant element init int

Xpcshell:

Tests appear once

One test unit per file

Numbers add up

TEST-PASS | /builds/slave/mozilla-1.9.1-linux-unittest-everythingelse/build/xpcshell/tests/test_zipwriter/unit/test_bug446708.js | test passed

Mochitest-a11y:

Multiple test units per file

Test units often appear more than once

Numbers add up

INFO TEST-PASS | chrome://mochikit/content/a11y/accessible/test_actions_aria.html | No actions on the accessible for 'clickable'

INFO TEST-KNOWN-FAIL | chrome://mochikit/content/a11y/accessible/test_nsIAccessibleTable_1.html | rowDescription should not throw

Browser-chrome:

Multiple test units per file

Test units often appear more than once

TEST-PASS | chrome://mochikit/content/browser/browser/base/content/test/browser_alltabslistener.js | Got an expected notification for the front notifications listener

TEST-PASS | chrome://mochikit/content/browser/browser/base/content/test/browser_alltabslistener.js | Got a notification for the front notifications listener

TEST-PASS | chrome://mochikit/content/browser/browser/base/content/test/browser_alltabslistener.js | onStateChange notification came from the correct browser

Mochitests

15867 INFO TEST-PASS | /tests/content/canvas/test/test_2d.line.width.transformed.html | pixel 86,25 is 0,255,0,255; expected 0,255,0,255 +/- 0

15863 INFO TEST-PASS | /tests/content/canvas/test/test_2d.line.width.transformed.html | pixel 15,25 is 0,255,0,255; expected 0,255,0,255 +/- 0

15856 INFO TEST-PASS | /tests/content/canvas/test/test_2d.line.width.transformed.html | pixel 16,25 is 0,255,0,255; expected 0,255,0,255 +/- 0

15870 INFO TEST-KNOWN-FAIL | /tests/content/canvas/test/test_2d.missingargs.html | should throw NOT_SUPPORTED_ERR

Multiple test units per file

Test units only appear once

Numbers off by a bit

Possible regexes:

re.compile(r'REFTEST TEST-.+crashtests')  # look for "REFTEST TEST-" followed by any characters and then "crashtests"
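
As a quick sanity check (my own sketch, not part of any existing harness), the pattern can be tried against the crashtest line quoted earlier:

import re

# One of the crashtest result lines from the log excerpt above.
line = "REFTEST TEST-PASS | file:///builds/slave/mozilla-1.9.1-linux-unittest-everythingelse/build/reftest/tests/gfx/src/mac/crashtests/306902-1.xml | (LOAD ONLY)"

# "REFTEST TEST-" followed by any characters and then "crashtests"
crashtest_re = re.compile(r"REFTEST TEST-.+crashtests")

print(bool(crashtest_re.search(line)))  # prints True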

What is a mock object?

Someone recommended reading this presentation about Unit Testing at StackOverflow.

First of all, I was amazed that FF has this hidden slides feature. I wondered what tool the author used to make the slides. Does anyone know?

Anyway, it is an interesting set of slides on unit testing, and it includes some slides on mock objects. I haven't fully understood them yet, but I think mocks are objects whose interfaces you can set expectations on. The expectations are based on known specifications, so that tests can be written against the specifications before they are implemented. Right now I am still not sure of the benefit of doing this. This is probably not new to many people, but it is the first time I have heard about it, and it sounds like a useful concept. If anyone can point me to good readings/articles/books on mock objects, I will definitely appreciate it.
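
To make the idea concrete for myself, here is a tiny hand-rolled sketch of how I currently understand it (all names here are made up for illustration; strictly speaking this is closer to a test spy than a full mock framework):

class MockMailer(object):
    """Stand-in for a real mailer: it records calls instead of sending anything."""

    def __init__(self):
        self.sent = []

    def send(self, address, body):
        # Record the call so the test can check it against the specification.
        self.sent.append((address, body))


def notify_owner(mailer, address):
    # Code under test: it only depends on the mailer's interface,
    # which can be specified (and tested) before a real mailer exists.
    mailer.send(address, "Your build has finished.")


def test_notify_owner_sends_one_message():
    mailer = MockMailer()
    notify_owner(mailer, "dev@example.com")
    # The expectation: exactly one message, to the right address.
    assert mailer.sent == [("dev@example.com", "Your build has finished.")]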

Implementing the functionalities of resultserver

In one of my previous posts, one of the remaining items was to have server-generated web pages that fetch reports from the data storage and bring them to the user. For this implementation, the project will be using brasstacks, couchquery, and webenv. So far, web pages for simple views have been implemented successfully using brasstacks. As a result, the focus is now on the scenarios for how the information will be used.

Currently focusing on:

  • given a build, finding the build that ran immediately before it.
  • given two builds, finding the differences at the individual test level, as mentioned in my last post.

To find the previous build, my idea, which I don't know is feasible or not, is to have a view where the keys are the necessary metadata and the value is the build id. When querying the view, filter the rows by the metadata of the current build. The metadata that identifies a particular build is the product, the OS, and the test type. The question is whether all of these can be passed as query arguments for key=, like key=[product, os, testtype]. So far, from what I have seen, I can set the key to be doc.build, such as emit(doc.build, doc), and then select a specific build by specifying key="sample_build_id".

To get the previous build, one approach is to iterate over the rows, find the timestamp that matches the timestamp of the current build, and return the row after it, which would be the previous build. We'll see how it goes once couch.io is back up.
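
Here is a rough sketch of that idea in Python, assuming the view rows come back as dicts carrying a build id and a timestamp (the field names are placeholders, not the actual couchquery output):

def find_previous_build(rows, current_build_id):
    # Sort the rows for builds with the same product/OS/test type, newest first.
    ordered = sorted(rows, key=lambda row: row["timestamp"], reverse=True)
    for i, row in enumerate(ordered):
        if row["build_id"] == current_build_id:
            # The next row in the ordered list is the previous build, if any.
            return ordered[i + 1] if i + 1 < len(ordered) else None
    return None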

Comparing Test Results

Some thoughts on how to compare two test results from two builds.

If only one build is chosen, find a similar build based on product, OS, test type that was run before the current build.

Comparing two rows (the same test file from, say, build 1 and build 2), here is some of the information that needs to be extracted:

  • new tests: tests that are included in build 1 but never included in build 2 (when number of verifications >)
  • new fails: new tests that failed
  • new passes: new tests that passed
  • new todos: new tests that are marked as TODO
  • missing tests: tests that are missing in build 1 but were included in build 2 (when number of verifications <)
  • missing fails: [3, 2, 0] vs [3, 5, 0]
  • missing passes: [3, 0, 0] vs [6, 0, 0]
  • missing todos:
  • stable tests: tests that have same number of verifications [3, 0, 0] vs [3, 0, 0]
  • previously didn’t pass: stable tests that did not pass in build 2 [3, 0, 0] vs [0, 2, 1]
  • previously didn’t fail: stable tests that did not fail in build 2 [0, 3, 0] vs [0, 3, 0]
  • previously not todo: stable tests that were not todos in build 2
  • new test files (test files that were never included in build 2)
  • missing test files (test files that were included in build 2 but missing in build 1)

Each result will show the count difference and the fail notes if any.
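
As a rough sketch of the comparison, assuming each build's results are a dict mapping a test file to [passes, fails, todos] counts (that layout is my assumption, not the final data model):

def compare_results(build1, build2):
    new_tests = [t for t in build1 if t not in build2]
    missing_tests = [t for t in build2 if t not in build1]
    # "Stable" tests: present in both builds with the same total number of verifications.
    stable_tests = [t for t in build1
                    if t in build2 and sum(build1[t]) == sum(build2[t])]
    # The finer categories (new fails, previously didn't pass, ...) would be
    # derived from the same per-test counts in a similar way.
    return {"new tests": new_tests,
            "missing tests": missing_tests,
            "stable tests": stable_tests}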

How Parsing is Done

In my last post, I talked about the Comparing Test Results project that I am currently working on. That post also mentioned parsing logs to extract useful information. So far this has been implemented and tested with data from Fennec builds. The intention of this post is to explain how the parsing is done, so it can serve as a reference for improvements that may be needed to extend this project to other builds.

There are two parts to the parsing. The first part looks at the entire log to find the metadata used to characterize and identify a set of data. Here the code is looking for the build ID, the product, the operating system, and the type of tests that were run; a rough sketch of this pass follows the list below.

  1. To get the build ID, the code looks for the line that contains the string "BuildID=", such as "BuildID=20090715053040". The information extracted is whatever comes after the '=' character.
  2. Similarly, for the product, the code looks for the string "Name=".
  3. For the OS and test type, the code looks for the line that contains the string "tinderbox: build: ", such as "tinderbox: build: Maemo mozilla-central crashtest". The OS is taken from the first word after the string "tinderbox: build:" and the test type from the last word.
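
A rough sketch of that first pass, using the strings from the description above (the function and field names are mine, not the actual code):

def parse_metadata(log_text):
    metadata = {}
    for line in log_text.splitlines():
        if "BuildID=" in line:
            # e.g. "BuildID=20090715053040" -> "20090715053040"
            metadata["buildid"] = line.split("=", 1)[1].strip()
        elif "Name=" in line and "product" not in metadata:
            metadata["product"] = line.split("=", 1)[1].strip()
        elif "tinderbox: build: " in line:
            # e.g. "tinderbox: build: Maemo mozilla-central crashtest"
            words = line.split("tinderbox: build: ", 1)[1].split()
            metadata["os"] = words[0]
            metadata["testtype"] = words[-1]
    return metadata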

The second part parses the log line by line to find lines that contain the output of a test run. For each such line, the code looks for the result of the test (PASS/FAIL/TODO), the location of the test code (normally the path to the test source file), and, if any, the message that describes the intent of the test; a rough sketch of this pass follows the list below.

  1. To determine whether a line is the output of a test, the code looks for any of these strings: "TEST-PASS", "TEST-FAIL", "TEST-UNEXPECTED-FAIL", "TEST-TIMEOUT", "TEST-KNOWN-FAIL".
  2. The line is then split into three sections separated by the '|' divider. The first section determines which of three conditions the code goes into: the test passed, the test is marked as TODO, or the test failed.
  3. The second section is taken as the location of the test code. Since only the relative path is needed, the beginning is stripped where it appears, as for 'reftest' and 'xpcshell'.
  4. The third section is taken as the message, if any, from the test. It is kept only if the test did not pass.
  5. The code also increments the counts of the number of passes, failures, and todos.
  6. In any of the three conditions, the code then checks whether this is a new test. If it is, it is added to the data structure with a count of 1, and if it is not a passing test, its message is added as well. Otherwise, the count of the existing test is updated, and if it is not a passing test, the message is appended, separated by a comma.
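
A rough sketch of that second pass, following the steps above (the data structure and the PASS/TODO/FAIL mapping are my reading of them, with TEST-KNOWN-FAIL treated as TODO):

RESULT_TOKENS = ("TEST-PASS", "TEST-FAIL", "TEST-UNEXPECTED-FAIL",
                 "TEST-TIMEOUT", "TEST-KNOWN-FAIL")

def parse_result_line(line, tests, counts):
    # Step 1: only lines carrying one of the result tokens are test output.
    if not any(token in line for token in RESULT_TOKENS):
        return
    # Step 2: split into status | test location | message.
    sections = [s.strip() for s in line.split("|")]
    if len(sections) < 2:
        return
    status, test = sections[0], sections[1]
    message = sections[2] if len(sections) > 2 else ""
    # (The real code also strips the absolute-path prefix for reftest and xpcshell.)
    if "TEST-PASS" in status:
        counts["pass"] += 1
        passed = True
    elif "TEST-KNOWN-FAIL" in status:
        counts["todo"] += 1
        passed = False
    else:
        counts["fail"] += 1
        passed = False
    # Step 6: new tests start with a count of 1; existing tests get their count
    # updated, and non-passing messages are appended, separated by a comma.
    if test not in tests:
        tests[test] = {"count": 1, "note": ""}
    else:
        tests[test]["count"] += 1
    if not passed and message:
        note = tests[test]["note"]
        tests[test]["note"] = message if not note else note + ", " + message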

Nothing mentioned here is final; it is still continuously being improved and is open to suggestions and extensions. This post will be updated as the code is updated.
