2015-11-30

Remembering the Glaciers

This week in Paris is the last chance for any kind of international agreement to do anything at all to reduce the effects of global warming. There may be other conferences later, but by then the topic of the conversation could be "it's too late". Today it's probably already too late to stay in that 2°C zone, but it's not too late to try.

1991 Zermatt ski panorama

I do not have any faith in my own country's politicians, given their actions this year:
  1. Essentially banning on-shore windmills through tightening planning permissions and allowing central government to block any proposals.
  2. Encouraging fracking through tax breaks and changing planning permissions to allow central government to approve any proposals.
  3. Extending the tax on carbon-generated electricity to —wait for it— renewable energy sources.
  4. Recently: killing Carbon Capture and Sequestration funding, despite CCS being the historic excuse for not needing to move away from coal and gas power: "CCS will handle it".
And more. It's not a good thing to have to say that the Chinese government is now clearly more forward-thinking on CO2, and on other environmental issues affecting its populace, than a European democracy.

For the week, then: a photograph of a glacier a day. If the politicians, including our own, don't act, photographs will be all that future inheritors of the planet have of them.

Today: snow and ice above Zermatt, Switzerland, in winter 1991; skiing nearby. Not sure of the peak, but as Zermatt would be to the right of this frame (we were staying on the eastern side of the valley; this is looking south), this could be part of the Monte Rosa group.

B&W negatives (Ilford, presumably), on a compact 35mm camera, manual D&P with this image being zoomed in on part of the frame, and the sky burned in manually. Pre-Photoshop, you used to have to wave bits of cardboard above the print to get the sky dark enough.

2015-11-17

Well, what about Groovy then?


Following on from my comments on Scala: what about Apache Groovy?

Stokes Croft Graffiti, Sept 2015

It's been a great language for writing tests in, especially when you turn on static compilation, but there are some gotchas.
  1. It's easy to learn if you know Java, so the effort of adding it to a project is straightforward.
  2. It's got that automatic log.info("timer is at $timer") string expansion. But the rules about when strings get expanded are 'complex'.
  3. Lists and Maps are in the language. This makes building up structures for testing trivially easy.
  4. The Maven Groovy task is a nice compiler for both Java and Groovy.
  5. Groovydoc is at least 10x faster than javadoc. It makes you realise that javadoc hasn't had any engineering care for decades; right up there with rmic.
  6. Its closeness to Java makes it easy to learn, and you can pretty much paste Java code into groovy and have it work.
  7. The Groovy assert statement is best in class. If it fails, it deconstructs the expression, listing the toString() value of every subexpression. As long as your classes have meaningful toString() values, debugging a failed assertion is way easier than anything else.
  8. Responsive dev team. One morning some maven-always-updates artifact was broken, so a stack trace followed the transition of countries into the next day. We in Europe hit it early and filed bugs -it was fixed before CA noticed. Try that with javac-related bugs and you probably won't have got as far as working out how to set up an Oracle account for bug reporting before the sun sets and the video-conf hours begin.
As I said, I think it's great for testing. Here's one of my tests:

@CompileStatic     // we want this compiled in advance to find problems early
@Slf4j             // and inject a 'log' field for logging
class TestCommonArgParsing implements SliderActions, Arguments {

  ...
  @Test
  public void testList1Clustername() throws Throwable {
    // note use of [ list ]
    ClientArgs ca = createClientArgs([ACTION_LIST, 'cluster1'])
    assert ca.clusterName == 'cluster1'
    assert ca.coreAction instanceof ActionListArgs
  }

  ...
}
What don't I like?
  1. Different scoping rules: everything defaults to public, which you have to remember when switching languages.
  2. Line continuation rules need to be tracked too ... you need the end of the line to be unfinished (e.g trail the '+' sign in a string concatenation). Scala makes a better job of this.
  3. Sometimes IDEA doesn't pick up use of methods and fields in your groovy source, so the refactorings may omit bits of your code.
  4. Sometimes that @CompileStatic annotation results in compilation errors, normally fixed by commenting out the annotation. Implication: the static and dynamic compilers are 'different'.
  5. Using == for .equals() is, again, a danger.
  6. The notion of truthiness in comparisons is very C/C++-ish: you need to know the rules. All null values are false, as are integer values that are zero and strings which are empty. Safe strategy: just be explicit.
  7. It's not so much type inference as type erasure, with runtime stack traces as the consequence.
  8. The logic about when strings are expanded can sometimes be confusing.
  9. You can't paste from Groovy to Java without major engineering work. It is at least possible, more so than pasting from Scala to Java, but it makes converting test code to production code harder.
Here's an example of something I like. This is possible in Scala, and Java 8 will make it viable-ish too. It's a test which issues a REST call to get the current slider app state, pushes out a new value and then spins awaiting a remote state change, passing in a method as the probe parameter.

  public void testFlexOperation() {
    // get current state
    def current = appAPI.getDesiredResources()

    // create a guaranteed unique field
    def uuid = UUID.randomUUID()
    def field = "yarn.test.flex.uuid"
    current.set(field, uuid)
    appAPI.putDesiredResources(current.confTree)
    repeatUntilSuccess("probe for resource PUT",
        this.&probeForResolveConfValues, 
        5000, 200,
        [
            "key": field,
            "val": uuid
        ],
        true,
        "Flex resources failed to propagate") {
      def resolved = appAPI.getResolvedResources()
      fail("Did not find field $field=$uuid in\n$resolved")
    }
  }

  Outcome probeForResolveConfValues(Map args) {
    String key = args["key"]
    String val  = args["val"]
    def resolved = appAPI.getResolvedResources()
    return Outcome.fromBool(resolved.get(key) == val)
  }

That Outcome class has three values: Success, Retry and Fail. The probe returns an Outcome, and the executor, repeatUntilSuccess, execs the probe closure until the probe succeeds, the duration of the retries exceeds the timeout, or a Fail response triggers a fail-fast exit. That lets my tests iterate until success, but bail out fast if the probe detects an unrecoverable failure. It's effective at avoiding long sleep() calls, which introduce needless delays on fast systems and are incredibly brittle on slow ones.
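For comparison, here is a rough sketch of the same probe-and-retry idea using Java 8 lambdas. It is not the Slider code: the Outcome enum, the ProbeRunner class and this repeatUntilSuccess signature are all hypothetical stand-ins, written only to show the three-state loop (succeed, keep retrying, or fail fast) in plain Java.

```java
import java.util.function.Supplier;

/** Hypothetical three-state probe result, modelled on the Outcome described above. */
enum Outcome {
    SUCCESS, RETRY, FAIL;

    /** true maps to SUCCESS; false means "not yet", so try again. */
    static Outcome fromBool(boolean ok) {
        return ok ? SUCCESS : RETRY;
    }
}

/** Hypothetical executor: an illustration, not the Slider implementation. */
final class ProbeRunner {
    private ProbeRunner() {}

    /**
     * Calls the probe until it returns SUCCESS or FAIL, or the timeout elapses.
     * Returns the last outcome seen; RETRY means the loop timed out.
     */
    static Outcome repeatUntilSuccess(Supplier<Outcome> probe,
                                      long timeoutMillis,
                                      long sleepMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            Outcome outcome = probe.get();
            if (outcome != Outcome.RETRY) {
                return outcome;              // SUCCESS, or FAIL: bail out fast
            }
            if (System.nanoTime() - deadline >= 0) {
                return Outcome.RETRY;        // timed out while still retrying
            }
            try {
                Thread.sleep(sleepMillis);   // short nap between probes
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Outcome.FAIL;         // treat interruption as a hard failure
            }
        }
    }
}
```

A test would pass the probe as a lambda or method reference, e.g. ProbeRunner.repeatUntilSuccess(() -> Outcome.fromBool(serverIsUp()), 5000, 200), where serverIsUp() is whatever check the test needs, and then assert on the returned Outcome.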

If you look at a lot of Hadoop test failures, they're of the form "test timeout on Jenkins", with a fix of "increase sleep times". Closure-based probes avoid that, especially probes that detect unrecoverable failures (e.g. network unreachable is a hard failure, different from a 404, which may go away on retries). Anyway: closures are great in tests. Not just for probes, but for functions to call once a system is in a desired state (e.g. web server launched).

We use Groovy in Slider for testing; the extra productivity and those assert statements are great. So does Bigtop —indeed, we use some Bigtop code as the basis for our functional tests. But would I advocate it elsewhere? I don't know. It's close enough to Java to co-exist, whereas I wouldn't advocate using Scala, Clojure or similar purely for testing Java code *within the same codebase*.

There's also Java 8 to consider. It is a richer language, gives us those closures, just not the maps or the lists. Or the assertions. Or the inline string expansion. But...it's very close to production code, even if that production code is java-7 only -and you can make a very strong case for learning Java 8. Accordingly,

  1. I'm happy to continue with Groovy in Slider tests.
  2. For an app where the production code was all Java, I wouldn't advocate adopting groovy now: Java 8 offers enough benefits to be the way forward.
  3. If anyone wants to add java-8 closure-based tests in Hadoop trunk, I'd be supportive there too.

2015-11-06

Prime numbers —you know they make sense


The UK government has just published its new internet monitoring bill.

Sepr's Mural and the police

Apparently one of the great concessions is they aren't going to outlaw any encryption they can't automatically decrypt. This is not a concession: someone has sat down with them and explained how RSA and elliptic curve crypto work, and said "unless you have a ban on prime numbers > 256 bits, banning encryption is meaningless". What makes the government think they have any control over what at-rest encryption code gets built into phone operating systems, or what encryption is used between mobile apps written in California? They didn't have a chance of making this work; all they'd do is be laughed at from abroad while crippling UK developers. As an example, if releasing software with strong encryption were illegal, I'd be unable to make releases of Hadoop, simply due to HDFS encryption.
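To show how little there is to ban, here is a toy sketch of textbook RSA using nothing but the JDK's BigInteger. The ToyRsa class and its method names are made up for illustration, and this is not secure as written (no padding, no key management); it only demonstrates that a working key pair is two large primes plus some modular arithmetic.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

/** Illustrative textbook RSA: NOT secure, no padding; for demonstration only. */
final class ToyRsa {
    final BigInteger modulus;          // n = p * q
    final BigInteger publicExponent;   // e
    final BigInteger privateExponent;  // d = e^-1 mod (p-1)(q-1)

    private ToyRsa(BigInteger n, BigInteger e, BigInteger d) {
        this.modulus = n;
        this.publicExponent = e;
        this.privateExponent = d;
    }

    /** Builds a key pair from two random probable primes of primeBits bits each. */
    static ToyRsa generate(int primeBits) {
        SecureRandom rnd = new SecureRandom();
        BigInteger e = BigInteger.valueOf(65537);
        while (true) {
            BigInteger p = BigInteger.probablePrime(primeBits, rnd);
            BigInteger q = BigInteger.probablePrime(primeBits, rnd);
            BigInteger phi = p.subtract(BigInteger.ONE)
                              .multiply(q.subtract(BigInteger.ONE));
            // Need distinct primes, and e must be invertible modulo phi.
            if (!p.equals(q) && phi.gcd(e).equals(BigInteger.ONE)) {
                return new ToyRsa(p.multiply(q), e, e.modInverse(phi));
            }
        }
    }

    /** c = m^e mod n; the message must be smaller than the modulus. */
    BigInteger encrypt(BigInteger message) {
        return message.modPow(publicExponent, modulus);
    }

    /** m = c^d mod n */
    BigInteger decrypt(BigInteger cipher) {
        return cipher.modPow(privateExponent, modulus);
    }
}
```

Calling ToyRsa.generate(256) picks two 256-bit primes, and decrypt(encrypt(m)) returns m for any m below the modulus. Real systems add padding and much larger keys, but the ingredients are the same ones every standard library ships.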

You may as well assume that nation states already have the ability to read encrypted messages (somehow), and deal with that by "not becoming an individual threat to a nation state". Same goes for logging of HTTP(S) requests. If someone really wanted to, they could. Except that until now the technical abilities of various countries had to be kept a secret, because once some fact about breaking RC4 or DH forward secrecy becomes known, software changes.

What is this communications bill trying to do then? Not so much legalise what's possible today, but give the local police forces access to similar datasets for everyday use. That's probably an unexpected side-effect of the Snowden revelations: police forces round the country saying "ooh, we'd like to know that information", and this time demanding access to it in a way that they can be open about in court.

As it stands, it's mostly moot. Google knows where you were and what your search terms were, and has all your emails, not just the metadata. My TV vendor claims the right to log what I watched on TV and ship it abroad, with no respect for safe-harbour legislation. As for our new car, it's got a modem built in and if it wants to report home not just where we've been driving but whether the suspension has been stressed, ABS and stability control engaged, or even what radio channel we were on, I would have no idea whatsoever. The fact that you have to start worrying about the INFOSEC policies of your car manufacturer shows that knowing which web sites you viewed is becoming less and less relevant.

Even so, the plan to record 12 months of every IP address's HTTP(S) requests (and presumably other TCP connections) is a big step change, and not one I'm happy about. It's not that I have any specific need to hide from the state; if I did, I'd try tunnelling through DNS, using Tor, VPNing abroad, or some other mechanism. VPNs are how you uprate any mobile application to be private and, as they work over mobile networks, they deliver privacy on the move. I'm sure I could ask a US friend to set up a Raspberry Pi with PPTP in their home, just as we do for BBC iPlayer access abroad. And therein lies a flaw: if you really want to identify significant threats to the state and its citizens, you don't go on about how you are logging all TCP connections, as that just motivates people to go for VPNs. So we move the people that are most dangerous underground, while creating a dataset of more interest to the MPAA than to anyone else.

[Photo: police out on Stokes Croft on Royal Wedding Day after the "Second Tesco Riot"]