David Henty

I have been working with supercomputers for over 25 years, and teaching people how to use them for almost as long. I joined EPCC after doing research in computational theoretical physics.

Location: EPCC, The University of Edinburgh, Scotland, UK.

Activity

  • Yes - the OS is constantly juggling dozens of different processes and threads, trying to ensure they all get their fair share of CPU time. The threads that OpenMP creates are just thrown into the mix with all the others. For HPC applications we usually make sure that a minimum of other tasks are running so the OpenMP threads will run almost continuously on the...

  • It is possible to reuse the heat but, until recently, the outlet water was not hot enough to be of much use. However, modern machines run much hotter, which makes the heat carried away by the water much easier to use, e.g. in heating other buildings - see "Energy Efficiency by Warm Water cooling" at...

  • @IstvanF The layout of all the cabinets is typically fixed to minimise cable lengths. Connecting all the cables is a huge job and normally done by dedicated experts.

  • That is a very good point - for large simulations on supercomputers, the limiting factor (the slowest part) is usually reading and writing memory and not the clock speed of the CPUs.

  • Yes - in a typical cellular automaton model you need to know the state of all the neighbouring cells. In 1D this is 2 neighbours (left and right), 2D is 4 neighbours (up and down as well), 3D is 6 neighbours ... In general, it's 2xD neighbours for D dimensions. If you include diagonals then the numbers of neighbours for 1D, 2D and 3D are 2, 8 and 26. In...
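
    To make the counting concrete: with diagonals included, each of the D coordinates of a neighbour can differ by -1, 0 or +1, which gives 3^D - 1 cells once you exclude the cell itself; without diagonals only one coordinate changes, giving 2xD. A tiny C illustration:

        #include <stdio.h>

        int main(void) {
            for (int d = 1; d <= 3; d++) {
                int face = 2 * d;            /* left/right, up/down, front/back */
                int moore = 1;
                for (int k = 0; k < d; k++)
                    moore *= 3;
                moore -= 1;                  /* 3^d - 1 when diagonals are included */
                printf("D = %d: %d neighbours, %d with diagonals\n", d, face, moore);
            }
            return 0;
        }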

  • There are a number of parallel packages that can do Molecular Dynamics on parallel supercomputers, e.g. NAMD, GROMACS, LAMMPS, AMBER, ... EPCC recently ran an online LAMMPS tutorial - see https://www.epcc.ed.ac.uk/blog/2019/online-lammps-training-archer

  • @AndrewMatthew Up until the early 2000's, each manufacturer had their own version of Unix, e.g. Unicos (Cray), Tru64 (DEC/Compaq), Irix (SGI), Solaris (Sun), AIX (IBM), ... The advantages were that each OS was tailored for a particular architecture, but the development cost of maintaining their own OS was too much for most companies so they gradually moved to...

  • You can argue that more powerful CPUs enable software to be written more easily as you can concentrate on functionality and elegance rather than having to worry about performance (since a fast CPU can still run less efficient software at an acceptable speed). Another view is that fast CPUs just encourage poorly written, bloated software!

  • Power consumption and heat are real issues for mobile devices - you want to maximise battery life and, as you point out, they are not well designed for getting rid of heat. This is why multicore technology is so attractive even if it makes the software more complicated - two cores each running at 1GHz use less power than one core running at 2GHz.

  • In practice, different cores will all be running at different speeds. Modern CPUs vary clock frequency dynamically based on load (e.g. turn it down if the processor is getting hot, crank it up if there aren't that many cores running and there is spare power). Even if they operated at the same clock speed, they would run at very different speeds in practice as...

  • The Game of Life is a very good example in terms of parallelising a real program. In practice, the strategy is identical to the traffic model - at each step, you update each cell based on the state of its nearest neighbours. In the 1D traffic model that just comprised the cells up and down the road. For the 2D Game of Life, it's the eight nearest neighbours...
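
    As a rough serial sketch in C (assuming the grid is stored with a one-cell halo of boundary copies so the interior loop needs no special cases, and that grid and next are the old and new generations):

        for (int i = 1; i <= N; i++) {
            for (int j = 1; j <= N; j++) {
                /* count the eight nearest neighbours */
                int live = grid[i-1][j-1] + grid[i-1][j] + grid[i-1][j+1]
                         + grid[i  ][j-1]                + grid[i  ][j+1]
                         + grid[i+1][j-1] + grid[i+1][j] + grid[i+1][j+1];

                if (grid[i][j] == 1)
                    next[i][j] = (live == 2 || live == 3);   /* survives */
                else
                    next[i][j] = (live == 3);                /* birth */
            }
        }

    Parallelising it then comes down to splitting the grid across processes and exchanging the halo cells at every step, exactly as in the traffic model.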

  • That's exactly correct - Message Passing is harder to implement, but less prone to subtle bugs. Most importantly for supercomputing, it is the only way to run on multiple nodes as Shared Memory is limited to a single node. Although this is a fine way to use all the cores on your laptop, on ARCHER this would limit you to running on only 24 cores of the total...

  • Virtualisation / containerisation is becoming more common in supercomputing as it allows you to develop on a local system (e.g. your laptop) and deploy on a larger machine (e.g. ARCHER). However, this can cause significant slowdowns for parallel programs. The whole point of virtualisation is to insulate operating systems from each other and from the hardware. In a...

  • That's correct - fans blow air over the blades, so it's cool air in and hot air out. The air is then cooled by large chillers which transfer that heat from the air to water, and at the ACF we can normally cool the water back down again using "ambient cooling" since the weather in Scotland is not normally very hot! See...

  • I commented on a similar point someone made in a different step and I think it's relevant here too:

    "This was tried in the early days of parallel computing and was called "metacomputing" - a single program running across separate computers distributed all over the globe. The problems are reliability (one of the machines could crash) and speed (it takes a...

  • This was tried in the early days of parallel computing and was called "metacomputing" - a single program running across separate computers distributed all over the globe. The problems are reliability (one of the machines could crash) and speed (it takes a long time for a computer in Europe to communicate with one in Japan). However, the model is used in...

  • My understanding is that it is using the appropriate precision for storing floating-point numbers rather than always using the highest precision available. For example, at the start of a calculation (where you may be a long way from the correct answer) there may be no need to use double-precision numbers - maybe single precision is enough. Later on, as you're...

  • @DavidFischak I first started working in HPC back in 1990 and you're right that there was a lot more diversity in the market: lots of competing processors and different flavours of Unix from numerous manufacturers. This changed and for quite some time we've had an almost complete monopoly of Intel x86 CPUs and Linux. However, things are changing again and, as...

  • @BernatMolero Monte Carlo simulation typically refers to any computation where random numbers are used. For example, if I wanted to simulate people evacuating from a building then I might use lots of random numbers to decide if someone turns left or right at the end of a corridor on their way out. This leads to lots of different simulations where people take...
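
    A toy sketch of the idea in C - the 3- and 5-minute exit times are invented purely for illustration, but it shows how averaging over many randomised trials gives a statistical answer:

        #include <stdio.h>
        #include <stdlib.h>

        int main(void) {
            int ntrials = 1000000;
            double total = 0.0;

            srand(12345);

            for (int t = 0; t < ntrials; t++) {
                /* 50/50 random choice at the end of the corridor */
                if (rand() % 2 == 0)
                    total += 3.0;   /* turned left: reaches the exit in 3 minutes */
                else
                    total += 5.0;   /* turned right: takes 5 minutes */
            }

            printf("Average evacuation time: %.2f minutes\n", total / ntrials);
            return 0;
        }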

  • Production-line manufacturing is a very good analogy. As you point out, there is parallelism within a single production line (e.g. different workers build different sections of a car as it passes down the line). The amount of parallelism might be limited, e.g. if there are 20 steps then you can't make use of more than 20 workers. The solution, as you've...

  • David Henty made a comment

    Hi - I'm David Henty and I work at EPCC at the University of Edinburgh, Scotland, UK. I co-developed the MOOC with Weronika and colleagues from SURFsara in the Netherlands.

  • As Jane points out, the number one machine has a performance profile that isn't necessarily representative of the majority of the world's supercomputers. However, another factor is that Moore's law is relevant for the performance of a single CPU. A supercomputer has many thousands of CPUs, so the total performance can outstrip Moore's law if we also increase...

  • Exactly - even a "null" message actually contains data such as the headers so they do clog up the network.

  • We could, but I think the issue has always been that processor speeds have increased more rapidly than memory systems so we're fighting a losing battle.

  • A very good point! Over the years, computing has swung between "thin client" models like your "dumb terminal" example (processing done remotely) and "thick client" models like powerful desktops (processing done locally). We seem to be in a "thin client" phase where many of our devices are just used as access points for remote processing systems such as...

  • These cycles are observed in real predator / prey data, see e.g. https://theglyptodon.wordpress.com/2011/05/02/the-fur-trades-records/

  • @TonyMcCafferty I don't know if it's exactly what you were thinking of, but people do something called "autotuning" to optimise performance. If there are lots of possible parameters to adjust for a computation, you can simply run thousands of copies with different settings and find out experimentally what the best settings are. This takes huge amounts of...
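
    In outline, autotuning is just a brute-force search over the settings - in this C sketch, time_kernel() is a placeholder for running the real computation with a given parameter:

        double best_time  = 1.0e30;
        int    best_block = 0;

        /* try each candidate block size and keep the fastest */
        for (int block = 8; block <= 1024; block *= 2) {
            double t = time_kernel(block);   /* placeholder: time the real code */
            if (t < best_time) {
                best_time  = t;
                best_block = block;
            }
        }

    In practice the different settings would be run as many independent jobs at the same time, which is why it needs so much compute power.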

  • If the batch system is doing a good job then the system should be reasonably full up all the time. People do build machines specifically to mine bitcoins, but it wouldn't be a cost-effective use of a supercomputer as you would not be using the capabilities of the high performance network.

  • Thanks for putting that link in!

  • @FrancescoMaroso You're correct that we could have had a GPU portion. However, we might have effectively ended up with two smaller systems - one with GPUs and one with CPUs - rather than one large system. The main focus of ARCHER was to enable very large simulations that could not be done on any other academic system in the UK so the decision was to have the...

  • F1 designers definitely use supercomputers to model their cars. However, to ensure a level playing field between teams, the amount of computer time they can use is severely limited, e.g. I found this discussion on an F1 fan site: https://www.f1technical.net/forum/viewtopic.php?t=13311

  • @HarryTerkanian We have a few simple parallel programs written in MPI plus C or Fortran that we use on training courses - see for example the exercise material at http://www.archer.ac.uk/training/course-material/2017/12/intro-ati/index.php - which cover image processing, fluid dynamics and fractals. These should be relatively easy to port to a Raspberry Pi...

  • @SimonHennessey We have a few simple parallel programs written in MPI plus C or Fortran that we use on training courses - see for example the exercise material at http://www.archer.ac.uk/training/course-material/2017/12/intro-ati/index.php - which cover image processing, fluid dynamics and fractals. These should be relatively easy to port to a Raspberry Pi...

  • People have been looking at using FPGAs for HPC for several years. Despite the potential for very good performance compared to power consumption, the problem has generally been programming them. It is very difficult to get good performance on them from large, numerically intensive programs written in C, C++ or Fortran.

  • @GillianC That's a good point - if a problem has a very complicated geometry such as if you wanted to simulate the air flow round an entire car then it is not easy to split the calculation up into equal-sized chunks. In situations like this then the approach is exactly as you describe - an important part of the pre-processing stage is "mesh partitioning" where...

  • @HarryTerkanian As ever, problems in computing have very good analogies in everyday life and "The Mythical Man Month" is an excellent analogy to the problem of just throwing more CPU-cores at a calculation. The real killer is that as you add more CPU-cores, each core is working on a smaller piece of the problem and the overhead of communication becomes greater.

  • The problem is to do with power consumption and heat production. Although we could produce a CPU with twice the speed, it would be so power hungry that it would be too expensive to run. It would also not be suitable for consumer devices as you would need expensive additional cooling to stop it overheating - your laptop can only really accommodate a small fan....

  • That's a very good point - on ARCHER the nodes are packaged so that there are four on a physical "blade". This means that these four nodes can actually communicate with each other much more quickly than with nodes on a different blade.

  • I'm glad you found them useful - we significantly expanded the "Towards the Future" section after the first run last year as it was clearly an area that people were interested in.

  • That's correct, but it's important to note that this comes from the use of accelerators (in the case of Piz Daint, NVIDIA GPUs) rather than traditional multicore CPUs. Since GPUs have a very different architecture to CPUs, it's not immediately clear how many "cores" a GPU has, but the top500 list appears to count the number of "Streaming Multiprocessors". The...

  • That's an interesting observation, but in supercomputer networks it turns out that the major overhead is getting the data onto and off of the network infrastructure. Once data is on the network it travels very fast, so the cable length doesn't have such a big effect on the end-to-end transfer time.

  • I was always sceptical about whether driverless cars would take off as, even if they reduce risks at a statistical level (i.e. fewer accidents across thousands of drivers), an individual driver will always think that they would have done better than the robot in each particular accident. However, I read an article that made the point that for driving there is a...

  • My understanding is that the complexity comes from simulating two materials of very different viscosities at the same time - oil is very thick and gas is very "runny" in the sense that it flows very easily. I'll see if I can find a more definitive answer ...

  • I don't think hard-wiring the OS would be a good idea as any errors could never be fixed, e.g. you could not patch the system when yet another security hole was discovered! I have talked about caches in terms of data, but in fact instructions are also cached so the performance of the operating system is usually very good as all commonly executed pieces will...

  • Although individual packets of data may be retransmitted, if there is a serious network failure then it will typically bring the whole system down. We spend lots of money on supercomputer networking for both speed and reliability. If you are doing calculations across widely distributed computers, such as is done by Amazon and Google, you build resilience into the...

  • @SandraPasschier It depends. On ARCHER, you do your visualisation on a separate (smaller) system called the Data Analytic Cluster, although it is connected to the same disk storage as ARCHER so you don't have to copy your data around. If the visualisation is very computationally expensive, or needs such huge amounts of data that you can't afford to write it...

  • My understanding is that TPUs are designed for very fast calculation but at low precision. This is OK for many artificial intelligence applications but probably not OK for traditional computer simulations - I touched on this a bit in a previous answer https://www.futurelearn.com/courses/supercomputing/3/comments/25575992

  • I didn't notice you'd already answered Anton's question before I posted my own answer in https://www.futurelearn.com/courses/supercomputing/3/comments/25988184

  • @TonyMcCafferty A very good point - log graphs can be deceptive and hide the enormous increase in the data values by collapsing them together. We touch briefly on quantum computing here, which some believe is the next step.

  • Having periodic boundary conditions in our one-dimensional traffic model is the same thing as using a circle (i.e. a roundabout) rather than a line (a straight road). As you point out, in two dimensions, periodic boundaries in both dimensions gives you the topology of a torus (i.e. a doughnut) where you come back to where you start if you head off either the...
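
    In code, periodic boundaries usually just mean wrapping the neighbour indices with a modulo so that the last cell's right-hand neighbour is cell 0 again - a minimal sketch for the 1D road, where update() is a placeholder for the traffic rule:

        for (int i = 0; i < N; i++) {
            int left  = (i - 1 + N) % N;   /* cell 0 wraps round to cell N-1 */
            int right = (i + 1) % N;       /* cell N-1 wraps round to cell 0 */
            next[i] = update(road[left], road[i], road[right]);
        }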

  • This is an interesting point. However, too many CPU cores has two downsides for supercomputing. First, it can overload the memory bandwidth as we saw in Week 2. Second, all CPU-cores on a node typically share a single network connection which means the communications can slow down. This is why in typical business and banking applications you see machines with...

  • I guess that modern satellites have improved the situation for some measurements. Interestingly, however, you can use simulation to cope with sparse experimental data. Imagine we know the weather today but only on a sparse grid (say 20-mile spacing). Let's start with yesterday's sparse experimental data and *guess* what the real data was on a 100-yard grid...

  • That's exactly the point - at some point the limiting factor is not the power of the CPU cores but their ability to access the memory. The table at the bottom of https://www.futurelearn.com/courses/supercomputing/3/steps/260548 illustrates this to some extent. Rather than adding more physical CPU-cores to a processor (which I'm not remotely qualified to do!) I...

  • @FrancescoMaroso The peak performance would have increased significantly. Each ARCHER node is around 0.5 Tflop (2 x 250 GFlop CPUs). At the time of installation, a GPU might have been around 1 TFlop peak so a CPU+GPU node would have had well over twice the pure CPU peak. However, for a national system like ARCHER, you need to look at the spread of applications...

  • Fortran is still commonly used in computational science because it is a language designed specifically for scientific and technical computing. However, the stats on ARCHER are slightly misleading. Many people run centrally installed packages - although a larger fraction of the CPU cycles are used running programs that are written in Fortran, this is because...

  • A very good question! From a user point of view, all the cores are basically the same. I have heard it said that certain core Linux operating system services are designed to run on the first core (which would be number zero) but I can't find any references to this ...

  • In case we didn't explain it clearly enough, it's not a question of the memory being "big-enough" - it's a question of whether it is possible for all CPU-cores to access the memory with sufficient speed. The answer is no - if all CPU-cores try to access main memory at the same time then they slow each other down. It's a bit like the road network - you would...

  • EPCC is involved in a project looking at the application of cross-point memory for supercomputing - see http://www.nextgenio.eu/. One focus is using it as a faster alternative to disk - the project is looking at the "new 3D XPoint non-volatile memory, which will sit between conventional memory and disk storage".

  • This illustrates one of the challenges of writing an efficient parallel program. By making some parts of the calculation very fast, other parts start to be the limiting factor and you have to start addressing them as well. So, a 2-hour check-in may not be particularly significant for a normal airplane but for Concorde it has a significant effect on the...

  • @GillianC There is a list of projects like this on Wikipedia - see https://en.wikipedia.org/wiki/List_of_distributed_computing_projects

  • That's correct - to run a program on multiple cores requires that the program is capable of being parallelised, and that a programmer has implemented the parallelism. However, remember that you can still use multiple cores by running several different serial programs at the same time. In this situation the operating system can automatically take advantage of...

  • That's the key point - to run a single program on multiple cores requires a parallel algorithm. The operating system can automatically keep all the cores busy if there is a large number of serial applications to run at the same time, but parallelising an application currently needs human expertise.

  • You're right that High Bandwidth Memory is a very important development for supercomputing - since we're typically limited by memory bandwidth, anything that increases it is going to have a dramatic impact on performance.

  • That's exactly the point I was trying to make - having two laptops is often just inconvenient. As a family grows, the parents would probably buy a single larger car rather than an additional small one.

  • The nice thing about learning OpenMP is that it is supported by almost all modern C, C++ and Fortran compilers and you can learn it by just using a standard multicore laptop. For technical reasons you can't use OpenMP from Python (it's an interpreted language and can't really cope with threads) although you can do message-passing with MPI via mpi4py.
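
    For anyone who wants to experiment, a minimal OpenMP program in C is just an ordinary loop with one directive added - compile with something like gcc -fopenmp:

        #include <stdio.h>
        #include <omp.h>

        int main(void) {
            int n = 1000000;
            double sum = 0.0;

            /* the reduction clause gives every thread its own partial sum */
            #pragma omp parallel for reduction(+:sum)
            for (int i = 0; i < n; i++) {
                sum += 1.0 / (i + 1);
            }

            printf("Up to %d threads available, sum = %f\n",
                   omp_get_max_threads(), sum);
            return 0;
        }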

  • That's a very good point. The way to mitigate this is for the scheduler to try and allocate nodes that are close together in the network, e.g. for ARCHER always try and give a job nodes that are in the same cabinet, as inter-cabinet communication is more costly.

  • Scheduling is normally done automatically and not by a human operator. However, visualisation can be very useful to understand what the scheduling algorithm is doing. If you think things are going wrong, e.g. you suspect the compute nodes are not being efficiently used, then visualisation can make what is going on much clearer. We had a summer student look at...

  • Point 1) is a very interesting one. The traffic model is a useful example here. No matter how large the problem is, i.e. regardless of the length of road on each process, you have to communicate information for a single cell to your nearest neighbours. In this simple case the overhead is therefore independent of problem size. In real calculations it is not...
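
    To put rough numbers on the traffic-model case: with N cells per process, each step does N cell updates but only 2 boundary-cell communications, so the communication overhead is of order 2/N of the work and shrinks as the local problem gets bigger.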

  • I have rather simplified things to make the point clear! You're right that Concorde didn't have the range to get to Australia in a single hop.

  • That's right - using multiple nodes means you really have no choice but to use message-passing. I addressed using both methods in a previous comment - https://www.futurelearn.com/courses/supercomputing/3/comments/25462595/

  • You're right that cache coherency is difficult at this scale. The naive approach I outlined in the article - everyone broadcasts all changes to everyone else - doesn't work here. You need to do more sophisticated book-keeping, for example keep a directory of which core holds which data in its cache so you can look it up and go directly to the right place.
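
    Very schematically, the directory is just a table indexed by cache line recording who owns it, so an update only needs to contact that one core (the names here are purely illustrative):

        /* a (much simplified) directory: which core, if any, holds each line */
        int directory[NLINES];                /* -1 means not cached anywhere */

        int owner = directory[line];
        if (owner >= 0)
            notify_owner(owner, line);        /* contact only the core that has it */
        directory[line] = my_core;            /* record the new owner */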

  • In reality you would use much more sophisticated simulations. You can use them to do short-term predictions about where congestion is likely to occur, which might enable you to prevent it by altering traffic light sequences or changing road priorities. You can also use them for longer term planning, for example trying to find out whether adding an extra lane...

  • These are very interesting questions - could you please repost them in the "Ask an Expert" section of Week 3 so they're in the list of potential topics?

  • This is a very interesting point - I answered a similar question here: https://www.futurelearn.com/courses/supercomputing/3/comments/25664655/

  • @JacqR It's really a question of economics - it would be expensive to construct a building using any kind of non-standard parts so bespoke doors would add to the cost. Plus there are the issues of transport, fitting the cabinets on forklift trucks, etc. If a manufacturer produced an oversized cabinet then it could severely restrict their market.

  • @DougBoniface It's always good to discuss things, especially when the answer is not clear! The bottom line is that, in supercomputing, it *is* faster to send one large message compared to many small ones but exactly why this is may not be 100% obvious in terms of the low-level networking.

  • @DougBoniface I'm not an expert in network hardware but my understanding is that there is an initial setup cost when the processor says "I want to send a message over the network". Once all the setup is complete, the processor can push data onto the network very quickly where it is subsequently broken up into packets. An analogy might be boarding a plane -...

  • @GillianC Modern networks are actually quite resilient against single packet failures, so the packet should be resent via a different pathway.

  • That's correct - the overhead of making a connection on a supercomputer network is quite high.

  • You are right that, at the lowest level, a message is split into independent packets which can in principle take different paths between source and destination to avoid congested areas of the network; if there are errors then packets are resent. It turns out that the major overhead in supercomputers is getting the data from the processor onto / off of the...

  • I've slightly skipped over the details here. For a cellular automaton the updates are supposed to be done as if they were instantaneous, i.e. independent of the ordering. What you are supposed to do is look at all the cars and say "this one can move, this one can't, ..." but without actually moving the cars. You then do a second pass and move all the cars that...
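
    For the 1D traffic model the two passes look roughly like this in C (1 is a car, 0 an empty cell, on a periodic road):

        #include <stdio.h>

        #define N 10

        int main(void) {
            int road[N] = {1,1,0,1,0,0,1,0,0,0};
            int moves[N], next[N];

            /* pass 1: decide which cars can move, without moving anything yet */
            for (int i = 0; i < N; i++) {
                int ahead = (i + 1) % N;
                moves[i] = (road[i] == 1 && road[ahead] == 0);
            }

            /* pass 2: apply all the moves "simultaneously" */
            for (int i = 0; i < N; i++) {
                int behind = (i - 1 + N) % N;
                if (moves[i])
                    next[i] = 0;            /* this car drives forward */
                else if (moves[behind])
                    next[i] = 1;            /* the car behind moves into this cell */
                else
                    next[i] = road[i];      /* nothing changes here */
            }

            for (int i = 0; i < N; i++)
                printf("%d", next[i]);
            printf("\n");
            return 0;
        }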

  • @PeteTaylor In our full outreach presentations we explain this a bit further. Although the dinosaur racing game is really designed for fun, it is based on real research - see http://www.animalsimulation.org/page2/styled-4/

  • If there is a node failure then the batch system, which allocates user jobs to compute nodes, will notice this and make sure that no further jobs are sent to the node until it is fixed. However, any programs running on that node will simply crash and fail. Even if your job uses hundreds of nodes (thousands of CPU-cores) a single node failure will cause the...

  • Thanks for spotting this error - I've corrected it. The reason I say "around" is that the distribution of compute nodes is not even across cabinets - some cabinets have fewer compute nodes as they include a few system nodes for IO etc. However, as you spotted, 9000 was completely wrong!

  • @StephenC This is exactly the approach taken in real parallel applications - issue all your sends and receives asynchronously then get on with your own work until the messages are all complete. Your comment about id-ing the messages is an interesting one - MPI (the standard message-passing library) is designed so that you normally don't have to worry about...
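
    Very roughly, that pattern in MPI looks like the sketch below (the buffer and function names are just placeholders): post the non-blocking calls, get on with the interior work, then wait for everything to complete.

        MPI_Request reqs[4];

        /* post the halo exchange with the left and right neighbours */
        MPI_Irecv(&halo_left,  1, MPI_DOUBLE, left,  0, comm, &reqs[0]);
        MPI_Irecv(&halo_right, 1, MPI_DOUBLE, right, 0, comm, &reqs[1]);
        MPI_Isend(&edge_left,  1, MPI_DOUBLE, left,  0, comm, &reqs[2]);
        MPI_Isend(&edge_right, 1, MPI_DOUBLE, right, 0, comm, &reqs[3]);

        update_interior();                          /* useful work needing no halo data */

        MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);  /* halos guaranteed to have arrived */

        update_boundaries();                        /* now safe to use the halo cells */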

  • My understanding of the history is that the US didn't want to sell China large numbers of top-end Intel chips so the Chinese produced their own CPU for their latest supercomputer. You are correct that not all systems will appear on the list, for example the specs of those used by intelligence agencies are unlikely to be made public.

  • @CharbelSolís This may not be true right now but it could be an issue in the future - new processors may have many tens of cores which might not be of any use to a home user. This isn't unique to supercomputing - many people spend tens of thousands of euros buying incredibly fast cars when, in the UK at least, the legal speed limit is under 115 km/h. Some...

  • @DougBoniface The fact that modern processors do not operate directly on memory means that you have to use some additional technique to ensure that two CPU-cores do not try and alter the same data at the same time. One approach is to lock the data - in the solution video I use the analogy of there being a single pen that you have to own to be allowed to write...
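
    In OpenMP, for example, the "single pen" corresponds to a critical region (or an explicit lock) - only one thread at a time is allowed inside it. A sketch, where compute() stands in for some independent work:

        double total = 0.0;

        #pragma omp parallel for
        for (int i = 0; i < n; i++) {
            double value = compute(i);   /* independent work, done in parallel */

            #pragma omp critical
            {
                total += value;          /* the "pen": one thread at a time */
            }
        }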

  • You are correct that cache coherency is enforced in almost all modern multicore processors. However, it is really done at the hardware level and not by the operating system. It was this mismatch that is at the root of the recent Meltdown bug - the hardware is caching sensitive data even though the operating system is trying to prevent a user from accessing it.

  • A very interesting question! My guess is that you'd need to do some kind of hard reset so that the nodes didn't just reboot themselves but also talked to each other to find out where they were on the network and work out the various network paths between them. However, I would expect the cabinets are physically the same so you could swap them about provided...

  • You've identified the key issue here. The standard cloud computing model envisages that data is only transferred between the user and the server. Although there may be many servers, they do not communicate with each other. This is because it all came about from internet searching and, as you've spotted, the searches are independent of each other. For typical...

  • @JasonPolyik A useful figure to remember is that, with a 3GHz processor, light only travels around 10cm every clock cycle.
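
    That figure is just the speed of light divided by the clock rate: (3 x 10^8 m/s) / (3 x 10^9 cycles/s) = 0.1 m per cycle, i.e. about 10cm.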

  • Following up on what @WeronikaFilinger has said, all simulations have their limitations. Bloodhound SSC is an extreme case, but a more standard example in car design would be crash testing where every new design has to pass safety tests based on crashing a real car. Although safety is the ultimate goal, it's not an issue in the tests themselves as the car will...

  • I'm afraid that I've been a bit loose with terminology here - I've corrected it to cycles. Thanks for pointing this out.

  • Hello Bayu - before I started working at EPCC I did research in lattice gauge theory for several years, which is actually how I started doing parallel computing.

  • Although GPUs were originally designed for doing graphics, they are capable of general-purpose calculations and can run computer programs that have nothing to do with visualisation, for example using NVIDIA's CUDA language. The GPUs are part of the supercomputer in the same way that the CPUs are. If you look at Step 2.14 you will see that ARCHER has two...

  • That's absolutely correct - on a supercomputer node we try and make sure that there is almost nothing running except for the user's processes, e.g. on ARCHER that would be no more than 24 processes (one per CPU-core). The operating system will still need to run a few essential tasks but again we try and keep them to a minimum.

  • @BartWauters You're right that things have become a little more complicated since 2016 when two existing systems (Piz Daint and Piz Dora) were combined into an upgraded machine. However, the point I was trying to make was that the Swiss system still has a similar number of nodes to ARCHER (around 6,500 vs 5,000) but a much higher performance. This must mean...