Devoxx - Day 3

The rest of the Citrus posse (Christoph, Martin, Ralf, Torsten) joined us (Marcel, Roland) yesterday, so we can spread out over many more talks and will flood this blog with even more reports from interesting Java talks.

Please note that we won’t cover the keynotes about Java 7 & 8, the State of the Web and the roadmap for JEE 7, since these are already covered in depth elsewhere. We are concentrating on the smaller pearls with a broad mix, also in order to give an impression of the trends going on in the Java world. BTW, the keynotes and selected talks will also be available as streams from Parleys.

Next big JVM language (Stephen Colebourne)

Wow, what a fast, enjoyable ride. It started with the definition of a big language: what the big languages are, where Java comes from, what Java got right and what it got wrong. From the lessons learned, a set of features was extracted which should be avoided. In Java, new features have been compromised by existing features, so for the core Java language only small, evolutionary changes are possible. The NBJL needs a balanced mixture of functional and OO features. It will have a static, non-excessive type system, plus other features learned from the experience gained with Java over the last decade. Finally Stephen compared the existing JVM languages out there and checked whether they meet the criteria he set up. Clojure is not the NBJL since it’s too foreign to Java. Groovy neither, because it’s dynamic: it’s a good complement to Java but not a replacement. In the end, the smackdown came down to Scala versus Fantom. Scala is a big jump from Java, whereas Fantom is a smaller step. The verdict: Scala is too complex, with the risk of becoming a write-only language. Fantom is closer to the NBJL, but its type system is not powerful enough (e.g. no generics). Since neither Clojure, Groovy, Scala nor Fantom meets all criteria, Stephen suggests considering a new Java, giving up backwards compatibility. Maybe this could be JDK 9?

An entertaining talk with good arguments, but of course biased. A set of good and viable criteria was presented against which a new language needs to be measured before it can become the next big JVM language.

Introduction to Cassandra (Jonathan Ellis)

Cassandra is a distributed, scalable database without a single point of failure. It combines the replication model from Dynamo (Amazon) with the data model from Bigtable (Google). Cassandra’s focus is scaling and performance. It uses a Log-Structured Merge-Tree to avoid random IO on rows. There are various degrees of tunable consistency levels, so you don’t have to deal with stale data if you don’t want to. Cassandra is fully monitorable via JMX. Since all column names are stored along with each row, schema changes can easily be performed. Dynamic column names can be used to do sophisticated queries (e.g. using column names as foreign keys, or materialized queries). With so-called SuperColumns (columns containing other columns), full denormalization is realized. Then he lost me ;-/
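To make the data model a bit more tangible, here is a rough sketch of my own in plain Java (no Cassandra client API involved, all names invented) of how a column family with dynamic column names can be pictured - each row holds a sorted map of columns, so the column names themselves can carry query semantics:

import java.util.SortedMap;
import java.util.TreeMap;

public class ColumnFamilySketch {
    public static void main(String[] args) {
        // row key -> (dynamic column name -> column value), columns kept sorted by name
        SortedMap<String, SortedMap<String, String>> userTimeline =
                new TreeMap<String, SortedMap<String, String>>();

        // here a timestamp acts as the column name, turning the row into a
        // materialized, pre-sorted query result ("dynamic column names")
        SortedMap<String, String> tweetsOfAlice = new TreeMap<String, String>();
        tweetsOfAlice.put("2010-11-17T09:15", "tweet-4711");
        tweetsOfAlice.put("2010-11-17T10:42", "tweet-4712");
        userTimeline.put("alice", tweetsOfAlice);

        // a slice query roughly corresponds to a range over the sorted column names
        System.out.println(userTimeline.get("alice").subMap("2010-11-17T09:00", "2010-11-17T10:00"));
    }
}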

Quite complex stuff; quite a change of mindset is required to make the switch. It seems that MongoDB offers a smoother migration experience than Cassandra for someone coming from an RDBMS background. But there seem to be some high-level APIs on the horizon (e.g. a JPA implementation) which might hide the complexity.

Spring 3.1 (Jürgen Höller)

The talk starts with a recapitulation of the Spring 3.0 themes. Annotations everywhere: with the enhanced stereotype model, it is possible to combine multiple annotations into a single one by creating a custom annotation which itself is annotated with the Spring annotations it should represent. Expressions are available in component annotations, too, which is especially useful with the @Value annotation. The declarative model validation allows for powerful validation via annotations. With @Scheduled, methods can be marked with a cron expression.
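As a small illustration of these 3.0 features (my own sketch, all class, property and cron values are invented, not from the talk): a composed stereotype annotation, an injected expression and a scheduled method might look roughly like this:

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

// a custom stereotype bundling @Service and @Transactional into a single annotation
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@Service
@Transactional
@interface MyService {
}

// the component only carries the single composed annotation
@MyService
class ReportGenerator {

    // SpEL expression evaluated at injection time (property name is made up)
    @Value("#{systemProperties['report.dir']}")
    private String reportDirectory;

    // runs every night at 2 am, assuming <task:annotation-driven/> is configured
    @Scheduled(cron = "0 0 2 * * *")
    public void generateNightlyReport() {
        // ...
    }
}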

Themes coming in 3.1 are:

  • Environment profiles for beans (see the sketch after this list)
  • Java based application configuration
  • Cache abstraction
  • Conversation management
  • Servlet 3.0, JSF 2.0, Groovy
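The sketch mentioned above - a rough idea of my own (not code from the talk, names invented) of how an environment profile could be combined with Java-based configuration in 3.1:

import javax.sql.DataSource;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;
import org.springframework.jdbc.datasource.DriverManagerDataSource;

// this configuration only contributes beans when the "dev" profile is active,
// e.g. via -Dspring.profiles.active=dev
@Configuration
@Profile("dev")
public class DevDataConfig {

    @Bean
    public DataSource dataSource() {
        DriverManagerDataSource ds = new DriverManagerDataSource();
        ds.setDriverClassName("org.h2.Driver");
        ds.setUrl("jdbc:h2:mem:devdb");
        return ds;
    }
}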

All in all, nothing spectacular but solid progress. The talk itself was a bit dry and anything but fireworks. I appreciate Juergen’s technical style, though.

Java Persistence 2.0 (Linda DeMichiel)

This talk covered the differences between the 1.0 and 2.0 releases of JPA.

The main additions covered included:

The first is the new Criteria API, which exposes a strongly typed, object-based query API. I found the queries written with the new Criteria API much harder to read than plain JPQL queries, making the code less readable. On the other hand, the API is definitely much more suitable for writing dynamic queries where the sort order or the filters are not known in advance.
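To illustrate the difference, here is a small comparison I put together myself (the Customer entity and its city attribute are invented, not from the talk) of the same query in JPQL and with the Criteria API:

import java.util.List;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;
import javax.persistence.criteria.CriteriaBuilder;
import javax.persistence.criteria.CriteriaQuery;
import javax.persistence.criteria.Root;

@Entity
class Customer {
    @Id Long id;
    String city;
}

public class CustomerQueries {

    // plain JPQL: short and readable, but only checked at runtime
    public List<Customer> findByCityJpql(EntityManager em, String city) {
        return em.createQuery(
                "select c from Customer c where c.city = :city", Customer.class)
                .setParameter("city", city)
                .getResultList();
    }

    // Criteria API: an object-based way to build the same query; more verbose,
    // but well suited for queries assembled dynamically at runtime
    public List<Customer> findByCityCriteria(EntityManager em, String city) {
        CriteriaBuilder cb = em.getCriteriaBuilder();
        CriteriaQuery<Customer> query = cb.createQuery(Customer.class);
        Root<Customer> root = query.from(Customer.class);
        query.select(root).where(cb.equal(root.get("city"), city));
        return em.createQuery(query).getResultList();
    }
}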

  • 2nd-level cache support: JPA beans can be annotated with @Cacheable and the cache is exposed via an API, allowing it to be queried and entities to be evicted from the cache at runtime. There is however no specification for cache providers, meaning that the developer still needs to be aware of the underlying cache provider to use and configure it properly (e.g. eviction times for entities).
  • Pessimistic locking: three new locking modes were added: PESSIMISTIC_READ, PESSIMISTIC_WRITE and PESSIMISTIC_FORCE_INCREMENT. A developer needs to be aware that different database vendors support locking modes differently. For example, a PESSIMISTIC_READ lock in JPA may result in an exclusive lock being acquired in one database and a read lock in another. Also, the exceptions thrown and how they influence running transactions can differ from one database vendor to another - an exception can result in a statement being rolled back or in a transaction rollback. A small sketch of both features follows this list.
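The sketch announced above - a rough, self-invented example (entity and method names are mine, not from the talk) of opting an entity into the second-level cache, evicting it via the Cache API, and loading it with a pessimistic lock:

import javax.persistence.Cache;
import javax.persistence.Cacheable;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;
import javax.persistence.LockModeType;

// entity opted into the second-level cache
@Entity
@Cacheable
class Order {
    @Id
    Long id;
}

class OrderRepository {

    // loads the order while acquiring a pessimistic write lock; how the lock is
    // implemented (row lock, select ... for update, ...) depends on the database
    Order loadForUpdate(EntityManager em, Long orderId) {
        return em.find(Order.class, orderId, LockModeType.PESSIMISTIC_WRITE);
    }

    // the second-level cache can be queried and evicted at runtime via the Cache API
    void evictFromCache(EntityManager em, Long orderId) {
        Cache cache = em.getEntityManagerFactory().getCache();
        if (cache.contains(Order.class, orderId)) {
            cache.evict(Order.class, orderId);
        }
    }
}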

Other new features touched on included:

  • Embeddables: for persisting objects that have no identity of their own (they are always owned by a parent entity)
  • Orphan removal: a new orphanRemoval attribute on relationships that goes beyond CascadeType.REMOVE from JPA 1.0 by also deleting entities removed from the relationship
  • Validation: allows properties of entities to be annotated with constraints (for example to specify valid ranges, min & max values) - a combined sketch of these three follows this list
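And the combined sketch for these three (again my own invented example, not code from the talk):

import java.util.ArrayList;
import java.util.List;
import javax.persistence.CascadeType;
import javax.persistence.Embeddable;
import javax.persistence.Embedded;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.OneToMany;
import javax.validation.constraints.Max;
import javax.validation.constraints.Min;
import javax.validation.constraints.NotNull;

// a value object without its own identity, always owned by a parent entity
@Embeddable
class Address {
    @NotNull                 // bean validation constraint
    String street;
    @Min(1) @Max(99999)      // valid range (example values)
    int zipCode;
}

@Entity
class LineItem {
    @Id @GeneratedValue
    Long id;
}

@Entity
class PurchaseOrder {
    @Id @GeneratedValue
    Long id;

    @Embedded
    Address shippingAddress;

    // orphanRemoval: items removed from this collection are also deleted from the database
    @OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
    List<LineItem> items = new ArrayList<LineItem>();
}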

There are definitely some nice additions in the 2.0 version, like the Criteria API, validation and better support for legacy databases; however, it seems that JPA 2.0 still hasn’t managed to completely shield the developer from differences in the underlying OR mapper implementations.

Overall it wasn’t a particularly interesting session. The material was read off the slides and there were no insights provided into the next JPA 2.1 version.

Reflection Madness (Heinz Kabutz)

Using many code examples (tricks), Heinz showed that reflection can be a very useful and flexible, but also dangerous, thing to play with.

The dangers lie, for instance, in:

  • Complex code
  • Static compilation does not catch typical errors (e.g. when code is written in XML and converted)
  • Runtime performance

Reflection should not be used very often, but there are some use cases where it really makes sense. Heinz analyzed a few of them (and also showed alternatives to reflection):

  • Finding out the size of an object
  • Finding out the caller of a method
  • Automatic delegator - useful if you want to delegate to an API you cannot change, but can’t do so directly because the methods you have to invoke are “protected” (see the sketch after this list)
  • Constructing objects without calling a constructor (for singletons, for example)

and others.
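For the “automatic delegator” case, the sketch referenced above - a small, generic helper of my own (not Heinz’s code) that invokes a protected method via reflection:

import java.lang.reflect.Method;

public class ProtectedInvoker {

    // invokes a protected no-arg method declared directly on the target's class
    public static Object invokeProtected(Object target, String methodName) throws Exception {
        Method method = target.getClass().getDeclaredMethod(methodName);
        // lifts the "protected" restriction (a SecurityManager may forbid this)
        method.setAccessible(true);
        return method.invoke(target);
    }
}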

Heinz warned against using reflection to modify non-“static” but “final” fields, because that would not have the expected effect. The reason is that “final” fields which are bound at compile time get inlined, so you cannot expect that every occurrence of that field in the code sees the changed value.
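A small example of my own (not from the talk) showing the effect: the reflective write does change the field, but code that was compiled against the constant typically still sees the old value:

import java.lang.reflect.Field;

public class FinalFieldDemo {

    // a compile-time constant: reads of this field are inlined by the compiler
    private final int answer = 42;

    int getAnswer() {
        return answer;   // compiled as the constant 42, not as a field read
    }

    public static void main(String[] args) throws Exception {
        FinalFieldDemo demo = new FinalFieldDemo();
        Field field = FinalFieldDemo.class.getDeclaredField("answer");
        field.setAccessible(true);
        field.setInt(demo, 7);                    // the field value really is changed ...
        System.out.println(field.getInt(demo));   // 7 - when read reflectively
        System.out.println(demo.getAnswer());     // typically still 42 - the constant was inlined
    }
}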

State of Hadoop (Tom White)

“Failure is a given”

Tom is the author of the book “Hadoop: The Definitive Guide” (O’Reilly), so I expected him to be very competent in this field. Tom showed, at a rather high level, what Hadoop is all about: how it works and which ecosystem belongs to it. If you weren’t familiar with this topic (like me), you got an impressive insight into this complex world of cloud and NoSQL.

In Hadoop we talk about clusters, not machines.

For example:

  • 5 - 4000 commodity servers
  • 8 cores, 16-24GB RAM …
  • Two-level network topology (20 - 40 nodes per rack)

Some other remarkable notes are:

HDFS in a nutshell:

  • streaming data access for very large files
  • files broken into blocks of 64MB or larger

Failure Modes:

  • datanode crash

    • clients read another replica
    • the namenode instructs datanodes to re-replicate
  • namenode crash

    • downtime (but bugs, configuration/human errors are worse)

MapReduce in a nutshell:

  • sort/merge is the primitive (operates at transfer rate)
  • batch-oriented (not for online access)
  • ad hoc queries (no schema)
  • distribution handled by the framework (you just supply map() and reduce() - see the word-count sketch below)
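The word-count sketch mentioned in the last bullet - my own rough version of the canonical Hadoop example, just to show that map() and reduce() really are all you write (job setup omitted):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCount {

    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                word.set(token);
                context.write(word, ONE);   // emit (word, 1) for every token
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable count : values) {
                sum += count.get();          // all counts for one word arrive together
            }
            context.write(key, new IntWritable(sum));
        }
    }
}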

Mapping the Hadoop Ecosystem:

  • HBase - a distributed column-oriented datastore (Facebook already uses it)
  • HDFS - a distributed filesystem for high-throughput access
  • ZooKeeper - a distributed coordination service
  • Hive - a distributed data warehouse with a SQL-like query language
  • Pig - a dataflow language for large datasets
  • Mahout - scalable machine learning and data mining

It was a good introduction to Hadoop for someone like me who wasn’t familiar with this matter. And that was surely Tom’s intention - to reach people like me - which he actually did. For a specialist it would surely have been too high-level.

Performance Anxiety (Joshua Bloch, Chief Java Architect at Google)

I was the only guy of the Citrus posse attending this talk - I went there mainly because of the speaker and not because of the topic itself.

My guess is that many others in the audience joined the talk for the same reason. I was very lucky to find a free seat fifteen minutes before the start - it was the most crowded talk so far.

In fact Mr. Bloch wanted to begin five minutes ahead of schedule but he wasn’t allowed - which was quite amusing.

The talk was very short - only half an hour instead of the scheduled hour - and had basically one message:

As systems get faster they get less predictable, so the only way to know what the performance of a system is like is to measure it and apply statistics to the measurements.

Ok, got the message.

He explained why nowadays using the & operator in Java is almost always faster than using && - I didn’t really understand it in detail, but the main point is that && introduces an extra conditional branch the CPU has to predict, whereas & simply evaluates both (cheap) operands without branching, which may lead to a result faster than evaluating each part one after another.
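A tiny sketch of my own (not Bloch’s example) of what that difference looks like in code:

class RangeCheck {

    // the && version introduces an extra conditional branch that the CPU has to predict
    static boolean inRangeShortCircuit(int index, int length) {
        return index >= 0 && index < length;    // second test skipped if the first fails
    }

    // the & version always evaluates both cheap comparisons and branches only once
    static boolean inRangeNonShortCircuit(int index, int length) {
        return (index >= 0) & (index < length); // both tests always evaluated
    }
}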

Mr Bloch then tried to demonstrate the unpredictability of execution times with a small example - a simple sort implementation whose execution times should vary in a range of about 20%. The example didn’t work - execution times were almost equal all the time. Hmm… I believe him anyway.

He also mentioned that profilers are useless - four different profilers identified four different hotspots in a test, he said. Hmm… not sure if I believe him on that one.

As a solution to how developers should deal with their performance anxiety he proposed to just “live with it” - great idea!

This means measure your performance - do statistical work - repeat.

Finally he said that benchmarking is very hard, most benchmarks are useless and that the only guy on the planet who does it right is his colleague “Cliff Click” - sounds like a comic hero name to me. There’s also a framework - of course by Google - which supports benchmarking. I didn’t have time to take a look at it, but here’s the link.

To sum it up: the talk was quite entertaining, the message was clear but not very surprising. It was worth listening to, but not the best talk I have attended so far…

Flex for Fun (Chet Haase)

Chet Haase talked about the new features in Flex4 and the major changes in this release. Basically his message was: Flex3 was brilliant for simple applications with a fine look and feel - a button looks like a button, a textfield looks like a textfield. Flex4 ships with new effects and an extended component model which opens up a lot of great ways to customize your components. Though customization was also possible in Flex3, it was not very straightforward. So if you look at Flex4 you will find existing features and customizations that have become much easier to use. We will see this in some of the fine examples Chet showed in his talk at Devoxx.

Graphics primitives

In Flex3 you needed to go deep into the Flash layer in order to draw primitive graphics. Flex4 offers components in the Flex layer with specific graphic XML tags for primitives like line, rectangle, ellipse, etc.

<s:Line xFrom="0" xTo="250" yFrom="0" yTo="250">
  <s:stroke>
    <s:SolidColorStroke color="gray" weight="2"/>
  </s:stroke>
</s:Line>

You can easily add strokes and fills to these graphic objects, with linear and radial gradients that are impressively easy to use. The graphical effects go a step further with filters.

Filters

The new Flex 4 component model is capable of listening to changes in filters and automatically reacting to those changes. The direct link between filter effect and component enables you to do nice manipulations on multiple components at runtime by applying changes to the filter object attached to those components.

<fx:Declarations>
  <s:BlurFilter id="blur"/>
</fx:Declarations>

You can reuse this blur filter on several components. A change in the blur filter at runtime will automatically take effect on the components and cause a rendering refresh.

States and Transitions

States are my personal highlight of this talk. I had never seen this state mechanism before in Flex - as I am not a Flex expert. But the fact that, as a Flex newbie, I was able to understand all demos and code examples in Chet’s talk was very impressive and proves the ease of use of Flex4. State definitions and state transitions are defined in the Flex layer using new XML tags. A change in the application state affects components on the application layer: components may disappear, change their color, or animate their appearance within a state transition. The cool stuff comes with the code examples, as they are so easy:

<s:states>
  <s:State name="myState"/>
</s:states>

<s:Label includeIn="myState"/>

<s:text x="250" y="300" x.myState="150" y.myState="200"/>

So in this example you can see how properties of components and their visibility are linked to states in your application. The textfield has different x and y coordinates set for the state “myState”. What I like most about states is the combination with state transitions, which put lovely animations on top of state changes:

<s:transitions>
  <s:Transition toState="myState">
    <s:Sequence>
      <s:Fade target="{myComponent}"/>
      <s:Move target="{myComponent}"/>
    </s:Sequence>
  </s:Transition>
</s:transitions>

Effects

Flex4 is able to use the new 3D effects in Flash 10, which add fine 3D animations (Rotate3D and Move3D) to your Flex application. The effects in the demo were cool and very easy to apply to components - just a few lines (tags) in the Flex layer.

Component Skinning

Your component is able to point to alternate skin files. A skin fulfills a contract with the component, so a button skin has to support button-specific states and properties, for instance. Depending on the skinClass attached to your component, you can change the look and feel very easily, even at runtime.

Summary

Chet’s talk was based on great demos and I liked it, because it showed how simple life is in Flex4 when it comes to customization and component changes in the Flex layer.

To scale or not to scale: Key/Value, Document, SQL, JPA – What’s right for my app? (Uri Cohen)

This talk covers NoSQL in general and gives a good introduction to the different data models supported. What is NoSQL? It’s short for Not-Only-SQL, describing datastores with one or more of the following characteristics:

  • Distributed
  • Scalable: up with e.g. more CPUs, and out with more nodes
  • Non-relational: unique and FK constraints are not practically enforceable when distributed
  • Not always ACID but BASE compliant: BASE stands for Basically Available, Soft state, Eventual consistency and provides an opposing strategy to ACID and expensive, distributed transactions

Another question is: why now? The answer is the exponential increase in non- or semi-structured data and in throughput, which classical relational databases have problems handling.

Cohen introduces the three most common NoSQL data models:

  • Key/Value or tuple: good for cache-aside (Hibernate 2nd-level cache) and simple id-based interactions such as user profiles. Examples: JBoss Cache, Membase, GigaSpaces
  • Column: a giant table of rows with a variable number of columns, queried by row key or column value. Good for a constantly changing but flat domain model. Example: Google BigTable
  • Document: nested value containers. Examples: CouchDB, MongoDB

The talk finishes by quickly covering GigaSpaces, an in-memory data grid with optional write-behind to a secondary, durable storage. Well, no surprise, since Cohen works for GigaSpaces - IMO the talk should have focused completely on either NoSQL or GigaSpaces, but not both.

What’s new in Scala 2.8 (Bill Venners, Dick Wall)

  • Bill … is also the lead developer and designer of ScalaTest, an open source testing tool for Scala and Java developers, and coauthor, with Martin Odersky and Lex Spoon, of the book Programming in Scala
  • Dick … is a member of the Java Posse

At the beginning both made clear that Twitter has now replaced Ruby with Scala for some backend work. I think Scala is becoming more and more mature, and version 2.8 is more an evolution than a revolution.

Bill and Dick showed some enhancements when it comes to productivity. There is stronger tooling now, pluggable into the various IDE platforms like IntelliJ, Eclipse and NetBeans, with better highlighting and cross code completion. The latter also works from the REPL (Scala’s command line interpreter).

Then they showed some coding in real time to demonstrate Scala enhancements - not without a lot of typos, which they eventually fixed. At least the audience had fun with that ;-)

Here is an example which I liked:

case class Mnemonic(e: String, g: String, b: String, d: String, f: String)

val m = Mnemonic(e = "every", g = "good", b = "boy", d = "does", f = "fine")

which is equivalent to

val m = Mnemonic(e = "every", g = "good", b = "boy", f = "fine", d = "does")

Note that with named parameters you can reorder the arguments any way you want. Default values for parameters are now also supported, so it is possible to set only the parameters which differ from their defaults.

Another example they demonstrated is the package notation, which can now be declared like in C# (the Java way is still possible):

package org {
  package scalatest {
    // here we are
  }
}

They finished by mentioning a new language feature called “continuations”. People who have worked with the Smalltalk web framework “Seaside” are already familiar with this. It is useful if you want to suspend a computation and come back after a while to continue it just where it stopped. This is handy for saving web states, for example, and if you want the web application to behave like a “normal” program when pressing Back and Forward, as in Seaside.

Testing RESTful WebServices (Jan Algermissen)

A contract is important for testing WebServices. With SOAP WebServices you have WSDL and XSD (and WADL for plain HTTP services) for defining contracts between client and server. In RESTful WebServices things are different: service-specific contracts are not common in REST environments. The speaker, Jan Algermissen, showed his view on testing RESTful WebServices and tried to give an idea of how to accomplish automated regression testing. To be honest, his approach is very academic and lacking in practical considerations, but more on this in my summary later.

First of all there are three layers of expectation in a RESTful WebService test:

  • Application layer (Human intention of what a RESTful WebService delivers as a response)
  • Resources (Which resources are accessible)
  • Message (Syntax, Semantics)

The speaker concentrated on the resource layer. From his point of view, message syntax (HTTP) is not likely to cause problems in the first place. I do not agree with that: custom HTTP header information as well as content types do count for me as a tester.

In detail a RESTful WebService has following characteristics and properties that are testable:

  • Link semantics matching the resource: an <img src=“images/dog.jpg”/> link points to an image resource
  • Resource availability: Resources are expected to be available (no 404 Not found responses)
  • Resource semantics are stable over time: Resource deliveries do not change in semantics from one client call to another
  • Variants remain stable: a request with Accept: application/order matches a response with Content-Type: application/order

After this short introduction the speaker tried to give an approach on how to accomplish tests for all these aspects. Basically his approach was to start with a static set of resource URIs defined by the tester. The test framework should start with this set and use a crawling mechanism to gather further information (resource semantics, variants, new resources, message syntax, and more).

Tests may fail in the above-described areas: link semantics, resource availability, resource semantics and variants. The gathered information as well as the test results go into a relational database for further regression testing.

Unfortunately the speaker did not come up with practical thoughts and examples of how to apply this testing approach. In general my opinion is that testing against a clean run of data collection is not the best way to do testing. Of course tests grow fast in quantity, but do they fare well in quality? What do you do with code changes where your reference test data becomes invalid, and how do you separate those failing tests from real problems in the application? Do we have to run the clean data crawling again from scratch? Does the clean-run data crawling mechanism really reflect the desired and expected behavior? I doubt that this testing approach can handle a software development process with changing code and changing expectations over time. Unfortunately the speaker only gave an academic idea of how to test RESTful WebServices and did not answer those questions that come from practical experience with automated testing.

Vaadin (Joonas Lehtinen)

Vaadin is a Java framework for developing rich web applications. It is built on top of Google’s GWT and promises to allow Java developers to focus on writing Java code, hiding all the JavaScript, HTML and CSS implementation details. Programming is very similar to traditional desktop programming, using events and listeners rather than requests and responses.
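A minimal sketch of my own (in the Vaadin 6 style current at the time; class name and captions invented) of what this desktop-like programming model looks like:

import com.vaadin.Application;
import com.vaadin.ui.Button;
import com.vaadin.ui.Button.ClickEvent;
import com.vaadin.ui.Window;

// plain Java, no HTML or JavaScript: events and listeners instead of requests and responses
public class HelloVaadinApplication extends Application {

    @Override
    public void init() {
        final Window main = new Window("Hello Vaadin");
        setMainWindow(main);

        main.addComponent(new Button("Click me", new Button.ClickListener() {
            public void buttonClick(ClickEvent event) {
                // the listener runs on the server, Vaadin updates the client via AJAX
                main.showNotification("Thanks for clicking!");
            }
        }));
    }
}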

So why use Vaadin?

Yet another Java web framework? Yes, but this one stands out from the others in my opinion. The client-side code (JavaScript, HTML) is generated auto-magically by Vaadin (using GWT under the hood). The developer does not have to spend time debugging JavaScript incompatibilities between different browsers or synchronising state between the client and server models, and can instead focus on the implementation logic. The low-level implementation details - for example AJAX or data binding between widgets - are hidden from the developer. On top of that, the online documentation is plentiful, a free e-book is available for getting up to speed quickly, and there is lots of sample code available on how to use the individual widgets. Custom widgets developed by the active Vaadin community are made available (via Maven) to other Vaadin users.

On the downside, Vaadin applications are not as scalable as traditional web applications: the memory footprint is higher because all client state is stored on the server side. That said, performance bottlenecks will usually occur in lower layers long before they show up in the GUI. Support for RTL (right-to-left) languages is unfortunately missing and is not on the roadmap for future Vaadin releases (yet).

My overall impression of the talk was very good. Joonas created a demo web application from scratch, demonstrating how easy and productive it is to develop with Vaadin.