A fun, interactive comparison of programming language verbosity

by

Thomas Roca

If you want to start a flame war among developers, all you need to do is start a discussion about what programming language is “best”.

I wanted to put a twist on that, starting with something less subjective: I’m simply curious about which languages are the most or least verbose.

A starting point could be to assess their “conciseness” in performing various tasks (and not make any judgment as to any language’s efficiency or effectiveness).

This leads me to Rosettacode.org, which is an awesome source of information for any programming enthusiast. It offers various tasks (more than 870) and code snippets to solve them, in many programming languages (more than 680).

I always wanted to explore Rosettacode.org and compare the different programming languages. It can help you discover new ways of addressing the problems you face in your “native” language, and identify alternative ways of thinking. To that end, I wrote a little app to facilitate this process, which I will share with you here.

The app is very basic. It simply compares the length of code snippets from Rosettacode.org for different tasks and languages, and displays the result on a bar chart using my favorite JavaScript library, Highcharts.

Python, Flask, Highcharts, and Azure (my playground) Environment

Azure is Microsoft’s cloud (disclaimer: I work for Microsoft). It is a great infrastructure to host your data, create bots and automate tasks, deploy Python apps or leverage AI.
For this demo, I installed a Python 3.4 environment on Azure and a few libraries (Requests, Beautiful Soup and Flask).

Step 1: Scraping

The first step is to scrape Rosettacode.org to get all available tasks. I wrote a very short Python script to do the scraping. I pre-selected a subset of programming languages (see languages_dict below) in order to avoid scraping too much irrelevant data.

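import requests
from bs4 import BeautifulSoup
from flask import request

# Maps each selected language's header id as it appears in the HTML page (e.g. "C.2B.2B") to its display name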
languages_dict={"Java":"Java","JavaScript":"JavaScript","C":"C","C.2B.2B":"C++","C.23":"C#","COBOL":"Cobol","Haskell":"Haskell","Python":"Python","R":"R","Julia":"Julia","MATLAB_.2F_Octave":"Matlab","Pascal":"Pascal","Fortran":"Fortran","BASIC":"BASIC","Go":"Go","Ruby":"Ruby","SAS":"SAS","Stata":"Stata","Swift":"Swift","Processing":"Processing","UNIX_Shell":"UNIX Shell","VBA":"VBA","PowerShell":"PowerShell"}

# store the display names in a list
language_name = list(languages_dict.values())

#get all tasks from Rosettacode.org
url_task="http://www.rosettacode.org/wiki/Category:Programming_Tasks"
r = requests.get(url_task)
soup = BeautifulSoup(r.text, 'html.parser')
table=soup.find("div", {"class": "mw-category"})

# Dictionary mapping each task title (the link's title attribute) to its page name in the URL
url_dict={}

# get all links (a tags)
tags=table('a')

# iterate over the tags and fill the task/url dictionary;
# [6:] strips the leading "/wiki/" from each href
for tag in tags:
    url_dict[tag.get('title',None)]=tag.get('href',None)[6:]

# store task names in a list
task_name = list(url_dict)
array_language=[]
count=[]
task=""

# Inside the Flask view function: handle the form submission with the task the user wants to compare
if request.method == "POST":
    # get url from task the user requested
    task = request.form['task']
    url="https://www.rosettacode.org/wiki/"+url_dict[task]
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'html.parser')
    dict_count={}
    for language in languages_dict:
        try:
            header=soup.find("span", {"id": language})
            snippet=BeautifulSoup(header.find_next("pre").text, 'html.parser')
            dict_count[languages_dict[language]] = len(snippet.text)
        except AttributeError:
            # no header or no <pre> block for this language on this page
            continue

    # sort languages by snippet length, longest first
    for lang in sorted(dict_count, key=dict_count.get, reverse=True):
        array_language.append(lang)
        count.append(dict_count[lang])

When the user selects the task to compare, the script above looks for the pre-selected languages and evaluates the length of the corresponding snippets (within the “pre” tag after the corresponding header within the HTML). The script stores the results in two arrays (array_language and count), which Flask sends to the frontend for Highcharts to plot.
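The counting and sorting step can be tried in isolation, without any network access. The following is a minimal sketch rather than the app's actual code: the function name rank_by_length and the sample snippets are hypothetical, made up for illustration.

```python
def rank_by_length(snippets):
    """Return (languages, counts) ordered from longest to shortest snippet.

    `snippets` maps a language name to its source text. len() counts every
    character, including whitespace and comments, as the script above does.
    """
    dict_count = {lang: len(code) for lang, code in snippets.items()}
    languages = sorted(dict_count, key=dict_count.get, reverse=True)
    counts = [dict_count[lang] for lang in languages]
    return languages, counts


# Hypothetical sample snippets (not taken from Rosettacode.org)
samples = {
    "Python": "for i in range(5):\n    print(i)",
    "Ruby": "5.times { |i| puts i }",
    "C": "for (int i = 0; i < 5; i++) {\n    printf(\"%d\\n\", i);\n}",
}

array_language, count = rank_by_length(samples)
print(array_language)
print(count)
```

Because the measure is a raw character count, indentation style and comments inside a snippet affect a language's score as much as its syntax does.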

Here are the resulting arrays for the snippets corresponding to a “For Loop”:

# array_language
['C', 'Swift', 'Pascal', 'Go', 'BASIC', 'Haskell', 'UNIX Shell', 'Matlab', 'Processing', 'PowerShell', 'Fortran', 'Cobol', 'VBA', 'C++', 'Java', 'Julia', 'Ruby', 'JavaScript', 'Python', 'R', 'SAS', 'Stata', 'C#']

#count (length of the snippets)
[90, 73, 144, 167, 70, 115, 400, 103, 103, 116, 1211, 533, 193, 104, 119, 71, 55, 114, 92, 85, 293, 79, 261]

These arrays will then feed the bar chart.

Step 2: Highcharts

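Flask passes the Python lists to the page template, where {{ array_language | safe}} and {{count}} place them directly in the Highcharts configuration. When the HTML page is rendered, the server replaces these placeholders with the arrays built in Python. Notice the | safe filter on array_language: it turns off the template engine's HTML auto-escaping, so the quotes around each language name reach the browser unchanged.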
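To see what ends up inside the rendered page, it can help to serialize the Python lists by hand. The sketch below is an illustration under assumptions, not the app's template code: it uses json.dumps from the standard library to produce JavaScript-compatible array literals, and the helper name to_js_literal is hypothetical. The sample values are the first three entries of the “For Loop” arrays shown earlier.

```python
import json


def to_js_literal(value):
    """Serialize a Python list or string as a JavaScript-compatible literal."""
    return json.dumps(value)


array_language = ["C", "Swift", "Pascal"]
count = [90, 73, 144]

# str() of a list of strings uses single quotes; with HTML auto-escaping on,
# those quotes would be escaped, which is why the template needs "| safe".
print(str(array_language))

# json.dumps uses double quotes and escapes embedded quotes and backslashes,
# so the result can be read by the browser as a JavaScript array.
print(to_js_literal(array_language))
print(to_js_literal(count))
```

Flask also ships a tojson template filter that does this serialization inside the template and additionally escapes characters that are unsafe in HTML, which may be a more robust choice than | safe if task or language names ever contain quotes.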

Highcharts.chart('container', {
    chart: {
        type: 'bar'
    },
    title: {
        text: 'Programming language comparison: {{task | safe}} '
    },
    subtitle: {
        text: 'Source: rosettacode.org'
    },
    xAxis: {
        categories: {{ array_language   | safe}},
        title: {
            text: null
        }
    },
    yAxis: {
        min: 0,
        title: {
            text: 'Task size (char)',
            align: 'high'
        },
        labels: {
            overflow: 'justify'
        }
    },
    tooltip: {
        valueSuffix: ' chars'
    },
    plotOptions: {
        bar: {
            dataLabels: {
                enabled: true
            }
        }
    },

    series: [{
        name: '{{task}}',
        data: {{count}}
    }]
});

Results

Below is a screenshot of the application.
Here are the Python script and HTML page hosted on Azure.

We now have an interactive way to compare the length of the code we need to perform many tasks! A nice feature to be added to the chart would be to display the code snippet as a tooltip over each language. This would enhance our visual exploration of Rosettacode.

Many of you might be thinking that this is a rather silly exercise, and may even claim that it is faster for you to write more code in your favorite language than less code in a different language. You are right. This little exercise tells you nothing other than how verbose different languages are when performing similar tasks. Fun for the language nerds out there, right? 🙂

If we really wanted to review code-level execution-time efficiency on a task-by-task basis, it would be really fun to take this one step further and develop some method for measuring execution time and resources for each language. We could then extend this demo with a fancy dashboard for code efficiency (with lots of caveats again, of course…).

I had a lot of fun setting up this demo. Feel free to share your experience or questions in the comments section below.

I’ll let Dilbert have the last word, though.

Source: Dilbert