Knowing the Purpose: What Are Marshaling and Data Representation?

Isuru Dhananjaya Ranaweera
7 min read · Jun 20, 2020


External data representation and marshaling are important concerns in data processing. In a distributed architecture, end users (clients) generally need to communicate with servers. Depending on the scenario, the data needs a proper representation and has to be transformed into a standard format before it is transmitted over a network. Several techniques are used for this.

Approach to Marshaling & Data representation

According to the principle of heterogeneity, communication in a distributed system has to account for operating systems, hardware architectures, web services, programming languages and so on. The main point is that these differ from one machine to another.

For example, a program holds its data in data structures, but when that data moves over TCP or UDP it has to be converted into a stream of bytes. Both sides need to agree on conventions for things such as:

  • Byte ordering of multi-byte values (big-endian or little-endian)
  • Representation of integers, floating point values, characters
  • ASCII and Unicode, etc.

To overcome these obstacles, one of the following methods can be used:

  • The values can be converted to an agreed external format before
    transmission and converted to the local form on receipt.
  • The values are transmitted in the sender’s format together with
    an indication of the format used, and the recipient converts the
    values if necessary.
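
The two methods above can be sketched in Java with `java.nio.ByteBuffer`. This is only an illustration: the class name `ByteOrderDemo`, the choice of big-endian as the agreed format, and the one-byte flag values are assumptions made for the example, not part of any standard.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ByteOrderDemo {

    // Method 1: always convert to an agreed external format (big-endian here).
    static byte[] encodeAgreed(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt(value).array();
    }

    static int decodeAgreed(byte[] message) {
        return ByteBuffer.wrap(message).order(ByteOrder.BIG_ENDIAN).getInt();
    }

    // Method 2: send in the sender's own byte order, prefixed by a one-byte
    // indication of the format (assumed here: 0 = big-endian, 1 = little-endian).
    static byte[] encodeTagged(int value, ByteOrder senderOrder) {
        ByteBuffer buf = ByteBuffer.allocate(5);
        buf.put((byte) (senderOrder == ByteOrder.LITTLE_ENDIAN ? 1 : 0));
        buf.order(senderOrder).putInt(value);
        return buf.array();
    }

    // The recipient reads the flag and converts only if necessary.
    static int decodeTagged(byte[] message) {
        ByteBuffer buf = ByteBuffer.wrap(message);
        ByteOrder order = buf.get() == 1 ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
        return buf.order(order).getInt();
    }

    public static void main(String[] args) {
        int year = 1934;
        System.out.println(decodeAgreed(encodeAgreed(year)));                          // prints 1934
        System.out.println(decodeTagged(encodeTagged(year, ByteOrder.LITTLE_ENDIAN))); // prints 1934
    }
}
```

In both cases the recipient recovers the same value; the difference is only in who does the conversion and whether the message carries a format indication.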

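An agreed standard for the representation of data structures and
primitive values is called an external data representation.
Marshaling is the process of taking a collection of data items and
assembling them into a form suitable for transmission. Unmarshaling
is the reverse: disassembling the message on arrival to rebuild an
equivalent collection of data items at the destination.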

Examples include CORBA’s Common Data Representation (CDR), Java’s object serialization and XML (Extensible Markup Language).

CORBA’s common data representation

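CORBA’s external data representation is known as CDR (Common Data Representation). It can represent primitive types (such as short, long, float, double, char, boolean and octet) and constructed types (such as sequence, string, array, struct, enumerated and union).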

Primitive and Constructed types

CDR enables clients and servers written in different programming languages to work together. It also handles byte ordering: values are transmitted in the sender’s ordering, which is specified in each message, and the recipient translates them if it uses a different ordering (for example, little-endian to big-endian). CDR assumes prior agreement on types, so no type information is included with the data representation in messages. Each argument or result in a remote invocation is represented by a sequence of bytes in the invocation or result message. A sample representation is as follows.

Source : http://www.brainkart.com/
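
To make the byte layout more concrete, here is a simplified, CDR-like sketch in Java. It is not a compliant CDR implementation: the class `CdrLikePerson` and the `Person` fields are hypothetical, the byte order is fixed to big-endian instead of being flagged per message, and strings are written as a 4-byte length followed by the characters, zero-padded to a 4-byte boundary. Note that no type information is written, so the receiver must already know the struct has two strings and one integer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class CdrLikePerson {

    // Hypothetical struct with the fields used in the Person example.
    static final class Person {
        final String name;
        final String place;
        final int year;

        Person(String name, String place, int year) {
            this.name = name;
            this.place = place;
            this.year = year;
        }
    }

    // Rounds a length up to the next multiple of 4.
    static int pad4(int n) {
        return (n + 3) / 4 * 4;
    }

    // String = 4-byte length, then the characters, zero-padded to a 4-byte boundary.
    static void putString(ByteBuffer buf, byte[] chars) {
        buf.putInt(chars.length).put(chars);
        while (buf.position() % 4 != 0) {
            buf.put((byte) 0);
        }
    }

    static String getString(ByteBuffer buf) {
        byte[] chars = new byte[buf.getInt()];
        buf.get(chars);
        buf.position(pad4(buf.position())); // skip the padding
        return new String(chars, StandardCharsets.US_ASCII);
    }

    static byte[] marshal(Person p) {
        byte[] name = p.name.getBytes(StandardCharsets.US_ASCII);
        byte[] place = p.place.getBytes(StandardCharsets.US_ASCII);
        ByteBuffer buf = ByteBuffer
                .allocate(4 + pad4(name.length) + 4 + pad4(place.length) + 4)
                .order(ByteOrder.BIG_ENDIAN); // fixed order: a simplification
        putString(buf, name);
        putString(buf, place);
        buf.putInt(p.year);
        return buf.array();
    }

    // The receiver relies on prior agreement: string, string, then integer.
    static Person unmarshal(byte[] message) {
        ByteBuffer buf = ByteBuffer.wrap(message).order(ByteOrder.BIG_ENDIAN);
        String name = getString(buf);
        String place = getString(buf);
        return new Person(name, place, buf.getInt());
    }

    public static void main(String[] args) {
        byte[] message = marshal(new Person("Smith", "London", 1934));
        Person copy = unmarshal(message);
        // prints "28 bytes: Smith, London, 1934"
        System.out.println(message.length + " bytes: " + copy.name + ", " + copy.place + ", " + copy.year);
    }
}
```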

The main components of the CORBA architecture

The CORBA architecture is designed to support the role of an object request broker that enables clients to invoke methods in remote objects, where both clients and servers can be implemented in a variety of programming languages. CORBA provides for both static and dynamic invocations. Static invocations are used when the remote interface of the CORBA object is known at compile time, enabling client stubs and server skeletons to be used. If the remote interface is not known at compile time, dynamic invocation must be used. Most programmers prefer to use static invocation because it provides a more natural programming model. The main components are briefly introduced below.

ORB core: The ORB core handles communication and also provides an interface to the application. This interface includes operations enabling the ORB to be started and stopped, operations to convert between remote object references and strings, and operations to provide argument lists for requests using dynamic invocation.

Object adapter: The role of an object adapter is to bridge the gap between CORBA objects with IDL interfaces and the programming language interfaces of the corresponding servant classes.

Skeletons: Skeleton classes are generated in the language of the server by an IDL compiler.

Client stubs/proxies: These are in the client language. The class of a proxy (for object-oriented languages) or a set of stub procedures (for procedural languages) is generated from an IDL interface by an IDL compiler for the client language.

Implementation repository: An implementation repository is responsible for activating registered servers on demand and for locating servers that are currently running.

Interface repository: The role of the interface repository is to provide information about registered IDL interfaces to clients and servers that require it. For an interface of a given type it can supply the names of the methods and, for each method, the names and types of the arguments and exceptions.
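
The division of work between a client stub, a skeleton and a servant can be sketched in plain Java. This is a hand-written toy, not CORBA: in a real system the stub and skeleton are generated by an IDL compiler and the request travels through the ORB over a network. The `Calculator` interface, the operation code `OP_ADD` and all class names are hypothetical, and the "network" is just a direct method call.

```java
import java.nio.ByteBuffer;

final class StubSkeletonDemo {

    static final int OP_ADD = 1; // assumed operation identifier

    // Hypothetical remote interface; in CORBA this would be declared in IDL.
    interface Calculator {
        int add(int a, int b);
    }

    // Servant: the server-side implementation of the interface.
    static final class CalculatorServant implements Calculator {
        @Override
        public int add(int a, int b) {
            return a + b;
        }
    }

    // Skeleton: unmarshals the request, invokes the servant, marshals the reply.
    static final class CalculatorSkeleton {
        private final Calculator servant;

        CalculatorSkeleton(Calculator servant) {
            this.servant = servant;
        }

        byte[] dispatch(byte[] request) {
            ByteBuffer in = ByteBuffer.wrap(request);
            int operation = in.getInt();
            if (operation != OP_ADD) {
                throw new IllegalArgumentException("unknown operation " + operation);
            }
            int a = in.getInt();
            int b = in.getInt();
            return ByteBuffer.allocate(4).putInt(servant.add(a, b)).array();
        }
    }

    // Stub (proxy): looks like a local object, but marshals each call into a message.
    static final class CalculatorStub implements Calculator {
        private final CalculatorSkeleton remote; // stands in for the network

        CalculatorStub(CalculatorSkeleton remote) {
            this.remote = remote;
        }

        @Override
        public int add(int a, int b) {
            byte[] request = ByteBuffer.allocate(12).putInt(OP_ADD).putInt(a).putInt(b).array();
            byte[] reply = remote.dispatch(request);
            return ByteBuffer.wrap(reply).getInt();
        }
    }

    public static void main(String[] args) {
        Calculator calc = new CalculatorStub(new CalculatorSkeleton(new CalculatorServant()));
        System.out.println(calc.add(2, 3)); // prints 5
    }
}
```

The client code only sees the `Calculator` interface, which is why static invocation feels like a natural programming model: the marshalling is hidden inside the stub and skeleton.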

Java’s object serialization

Java has a mechanism called object serialization, in which an object can be represented as a sequence of bytes that includes the object’s data as well as information about the object’s type and the types of data stored in the object. After a serialized object has been written into a file, it can be read from the file and deserialized; that is, the type information and bytes that represent the object and its data can be used to recreate the object in memory.

A key benefit is that the serialized form is independent of the JVM: an object can be serialized on one platform and deserialized on an entirely different platform.

Classes ObjectInputStream and ObjectOutputStream are high-level streams that contain the methods for serializing and deserializing an object.

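To make use of Java serialization, the class of the object (Person in this example) must implement java.io.Serializable. Then:

  • To serialize: create an instance of ObjectOutputStream
  • Invoke the writeObject method, passing the Person object as argument
  • To deserialize: create an instance of ObjectInputStream
  • Invoke the readObject method to reconstruct the original object

// Serialize: write the object to an output stream
ObjectOutputStream out = new ObjectOutputStream(…);
out.writeObject(originalPerson);

// Deserialize: readObject() returns Object, so a cast is required
ObjectInputStream in = new ObjectInputStream(…);
Person thePerson = (Person) in.readObject();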

The basic order in which an object is written out is:

  1. Its class information is written out: name and version number.
  2. The types and names of its instance variables are written out.
     • If an instance variable belongs to a new class, then the new class information must
       also be written out, recursively.
     • Each class is given a handle.
  3. The values of the instance variables are written out.

For example:

Person p = new Person("Smith", "London", 1934);
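
A complete round trip for this Person object might look like the sketch below. The article only shows the constructor call, so the field names (`name`, `place`, `year`) and the class `SerializationDemo` are assumptions, and an in-memory byte array is used here in place of a file or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerializationDemo {

    // Field names are assumed; only the constructor call appears in the text.
    static final class Person implements Serializable {
        private static final long serialVersionUID = 1L; // the class version number

        final String name;
        final String place;
        final int year;

        Person(String name, String place, int year) {
            this.name = name;
            this.place = place;
            this.year = year;
        }
    }

    // Marshals the object (class info, field descriptions, values) into bytes.
    static byte[] serialize(Person p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds an equivalent object from the bytes.
    static Person deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Person) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Person p = new Person("Smith", "London", 1934);
        Person copy = deserialize(serialize(p));
        System.out.println(copy.name + ", " + copy.place + ", " + copy.year); // prints "Smith, London, 1934"
    }
}
```

Unlike CDR, the serialized form carries the class name and field descriptions, so the receiver does not need prior agreement on the layout, only access to the Person class.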

XML (Extensible Markup Language)

Example of XML

The XML standard is a flexible way to create information formats and electronically share structured data via the public Internet, as well as via corporate networks. XML, a formal recommendation from the World Wide Web Consortium (W3C), is similar to Hypertext Markup Language (HTML). Both XML and HTML contain markup symbols to describe page or file contents. HTML describes Web page content (mainly text and graphic images) only in terms of how it is to be displayed and interacted with, whereas XML describes the content in terms of what data it holds.

XML use cases

  • XML can work behind the scenes to simplify the creation of HTML documents for large web sites.
  • XML can be used to exchange information between organizations and systems.
  • XML can be used for offloading and reloading of databases.
  • XML can be used to store and arrange data in a way that suits your data handling needs.
  • XML can easily be merged with style sheets to create almost any desired output.
  • Virtually any type of data can be expressed as an XML document.

Given these use cases, XML is usually handled through language-specific tools. For example:

JAXB: It stands for Java Architecture for XML Binding. It provides a mechanism to write Java objects into XML and read XML back as objects. Simply put, it is used to convert Java objects into XML and vice versa.

JAXB provides a fast and convenient way to bind XML schema and Java representations, making it easy for Java developers to incorporate XML data and processing functions in Java applications. As part of this process, JAXB provides methods for unmarshalling (reading) XML instance documents into Java content, and then marshalling (writing) Java content back into XML instance documents. JAXB also provides a way to generate XML schema from Java objects.
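
To show what such a binding has to do, the sketch below marshals the Person fields into XML and reads them back. It deliberately does not use JAXB, which is a separate dependency on recent JDKs; it uses only the DOM and transformer APIs that ship with the JDK, so the mapping that JAXB would automate is written by hand. The class name `XmlPersonDemo` and the element names `person`, `name`, `place` and `year` are assumptions made for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class XmlPersonDemo {

    // Marshals the three fields into a <person> element (element names are assumed).
    static String marshal(String name, String place, int year) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element person = doc.createElement("person");
            doc.appendChild(person);
            addChild(doc, person, "name", name);
            addChild(doc, person, "place", place);
            addChild(doc, person, "year", Integer.toString(year));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("marshalling failed", e);
        }
    }

    private static void addChild(Document doc, Element parent, String tag, String text) {
        Element child = doc.createElement(tag);
        child.setTextContent(text);
        parent.appendChild(child);
    }

    // Unmarshals one field by parsing the XML text and reading the element's content.
    static String field(String xml, String tag) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("unmarshalling failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = marshal("Smith", "London", 1934);
        System.out.println(xml);
        System.out.println(field(xml, "name") + " was born in " + Integer.parseInt(field(xml, "year")));
    }
}
```

Because the element names travel with the data, the message describes itself, which is what makes XML readable by hand and by programs written in other languages.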

Conclusion

Considering CORBA’s CDR, Java’s object serialization and XML, we can summarize the main points. In CORBA’s CDR and Java’s object serialization, the marshalling and unmarshalling activities are intended to be
carried out by a middleware layer without any involvement on the part of the application programmer.

XML is textual and therefore more accessible to hand-encoding, but even so, software for marshalling and unmarshalling is available for all commonly used platforms and programming environments. Because marshalling requires the consideration of all the finest details of the representation of the primitive components of composite objects, the process is likely to be error-prone if carried out by hand. Compactness is another issue that can be addressed in the design of automatically generated marshalling procedures.

In the first two approaches, the primitive data types are marshalled into a binary form. In the third approach (XML), the primitive data types are represented textually. The textual representation of a data value will generally be longer than the equivalent binary representation.
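
The size difference is easy to check for a single value. The sketch below writes the integer 1934 once in binary form with `DataOutputStream.writeInt` and once as an XML element; the class name `SizeComparison` and the element name `year` are assumptions for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class SizeComparison {

    // Binary form: a Java int is always written as 4 bytes.
    static int binarySize(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Textual form: the digits plus the surrounding markup, encoded as UTF-8.
    static int textSize(int value) {
        String xml = "<year>" + value + "</year>";
        return xml.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println("binary: " + binarySize(1934) + " bytes"); // prints "binary: 4 bytes"
        System.out.println("XML: " + textSize(1934) + " bytes");      // prints "XML: 17 bytes"
    }
}
```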


Written by Isuru Dhananjaya Ranaweera

BSc. (Hons.) in Software Engineering | University of Kelaniya, Sri Lanka| https://github.com/Isuru40 | https://isururanaweera.me
