Anna Gray

Broadcast Systems Engineer : design and operations
w e b 2 0 at i w e u a dot c o m

Location U.K..... mostly....for now, at least.


  • Yield a little bit like C's Static??
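
A rough sketch of the comparison: a generator function keeps its local variables alive between `next()` calls, much as a C `static` local survives between function calls. One difference is that every generator object gets its own copy of the state. The name `counter` is made up for illustration.

```python
def counter():
    """Yield 1, 2, 3, ... keeping `count` alive between calls."""
    count = 0
    while True:
        count += 1
        yield count

first = counter()
second = counter()
a = next(first)   # 1
b = next(first)   # 2, the generator remembered count
c = next(second)  # 1, a separate generator has separate state
```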

  • My only consideration : should I make one function taking radius and height, which returned two values, Area and Volume, or two functions, one returning Area and one returning Volume.

    I chose two functions for function clarity and simplicity ( to my C programmer's mind )

    round ( CylinderArea (radius,height), 2 )
    returns 471.24
    round (...
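
A minimal sketch of the two-function approach described above. The formula (total surface area, end caps included) and the inputs radius 5 and height 10 are assumptions; they are chosen because they reproduce the 471.24 quoted. The snake_case names are illustrative, not the original code.

```python
import math

def cylinder_area(radius, height):
    """Total surface area: two end caps plus the curved side."""
    return 2 * math.pi * radius * (radius + height)

def cylinder_volume(radius, height):
    """Volume: area of the base times the height."""
    return math.pi * radius ** 2 * height

print(round(cylinder_area(5, 10), 2))    # 471.24
print(round(cylinder_volume(5, 10), 2))  # 785.4
```

Python could also return both values from one function as a tuple, but two small functions are each easier to name and test.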

  • Anna Gray made a comment

    Python... default argument values : genius!
    It will tend to push complexity into the functions which need to be thoroughly tested. The more complex the function with the more parameters it can take, the harder it is to test under all conditions.

  • I have been learning Python for about a year, never came across *args / **kwargs and was weirded out a little as I thought they were pointers or something. Some of the ( many ) mysteries of Python are evaporating quite quickly with this course.
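
For anyone else coming from C: the stars are not pointers. A small sketch, with the function name `describe` made up for illustration, showing that `*args` collects extra positional arguments into a tuple and `**kwargs` collects extra keyword arguments into a dict.

```python
def describe(*args, **kwargs):
    """Return the positional and keyword arguments as (tuple, dict)."""
    return args, kwargs

positional, named = describe(1, 2, 3, unit="cm", precision=2)
# positional is the tuple (1, 2, 3)
# named is the dict {"unit": "cm", "precision": 2}
```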

  • Anna Gray made a comment

    Python is just so cool !
    I guess it can get you into lots of trouble with it too!
    ( old C coder talking )

  • Run with the argument ' -h' ( h = usually for 'help' )
    >python -h

    usage: [-h]

    optional arguments:
    -h, --help show this help message and exit
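
The help text above has the shape that Python's `argparse` module prints. A minimal sketch, assuming the script builds its parser with `argparse` (the comment does not say so): the `-h` / `--help` option is added automatically.

```python
import argparse

# A bare parser already understands -h and --help.
parser = argparse.ArgumentParser()

# format_help() returns the same text that -h would print.
help_text = parser.format_help()
print(help_text)
```

On Python 3.10 and later the section heading reads `options:` rather than `optional arguments:`.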

  • The ipynb files are Jupyter Notebook files. They are the course work files seen in the following videos, which you can replicate on your laptop in a browser or by installing JupyterLab on your local machine : HOW?

    To try quickly in your browser

    goto :

    A new page will open :

    There is...

  • Yep : so you can type
    python -test

    The output will be something like
    ['', '-test']

    Now we can check if args are passed when running the script
    if sys.argv[1] == "-test":
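
One thing to watch: `sys.argv[1]` raises `IndexError` when the script is run with no arguments. A hedged sketch of a safer check, with the helper name `wants_test` made up for illustration:

```python
import sys

def wants_test(argv):
    """True when the first argument after the script name is '-test'."""
    # argv[0] is the script name, so real arguments start at index 1.
    return len(argv) > 1 and argv[1] == "-test"

if __name__ == "__main__":
    if wants_test(sys.argv):
        print("test mode")
```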

  • Azure is a huge environment, isn't it?
    I hadn't quite realised until now.

  • Five crucial elements
    1. Azure identity services : Let the right people in, the wrong ones out
    2. Security tools and features : Security is vital
    3. Privacy, compliance and data protection standards : Best practice
    4. Secure network : Essential though most N/W traffic now secure
    5. Monitoring and reporting : To see what is and isn't working

  • GDPR was / is an EU regulation, with the UK now retaining the legislation and in many cases adopting a more rigorous approach. I don't think anyone wants to go near trying to unpick it.

    To untangle data and geo locate it for many international companies is well nigh impossible, and who would / what authority in reality would / could try to audit that, and...

  • Anna Gray made a comment

    I note that products such as Solarwinds and Zabbix monitoring instances can be built on Azure as 3rd party monitoring.
    These can report on bandwidth, memory, CPU usage, hard disk capacity, load balancing issues etc, vitally useful on high load or high traffic peak, real time systems.

  • The use by default of password complexity rules, coupled with the widespread adoption of password managers ( Lastpass/Keepass/Browser ) has reduced the compromise of accounts significantly. Limiting access to accounts by the use of IP whitelist rules also reduces the compromise surface. That is not to say it still doesn't happen, just that it shouldn't in...

  • 30 Nov 21 : got it working after some time

    Install the CLI on Windows and Linux ( I chose Ubuntu CLI )
    I chose : Option 1: Install with one command
    then to run the program
    $ sudo az login
    You will be asked to navigate to

  • @SörenLairdSörries @DaveGunn

    The term Container in software and IT is a little nebulous, but they are essentially a ( ideally single ) software process running / depending / utilising a larger operating system superstructure.

    Thus multiple containers can run on a single Windows/Linux machine ( eg see Docker ) Much like individual shipping containers...

  • Really good introduction to Azure and demonstrates just how straightforward 'spinning up' an online resource can be

  • Top left of Azure control panel screen
    Home>VMName>Networking>Add inbound port rule >
    Source : Any
    Source Port : * ( an HTTP request can come from a large range of ports )
    Destination : Any ( We can send traffic to any public IP Address )
    Service : HTTP : Port 80
    Priority : default ( 330 )
    Name : Port_HTTP

  • The ( currently ) single Azure data centre for the whole continent of Africa is located in South Africa. You would be right to think that some provision should be held off the west coast catering to Ghana, Nigeria, Cameroon and Gabon for example. Currently no.

    I assume there just isn't the demand, that current West African country network infrastructure...

  • Samuel, you raise an excellent point where dedicated machines in your industry, healthcare, manufacturing etc have high functionality, working, stable software, often running on legacy operating systems ( windows / embedded linux etc ) that can't be put 'in the cloud'.

  • The division of visual diagrams into 4 key categories,
    is very useful as an immediate guide on which type of diagram is optimal to display the data and conclusions you wish to draw from the data. thank you

  • The Promo of 50% doesn't work, or is way too generous. Promo days resulted in a similar sales quantity ( coffees served / day ) to other days but half the revenue; they didn't increase the number of coffees sold ( significantly )
    Shop doesn't need 8 staff, especially Mon-Wed
    Little correlation between Avg Temperature and Hot/Cold coffee sales
    Busiest days are Friday...

  • We moved on to Energy futures?

  • Anna Gray made a comment

    Wow : I feel I learnt a LOT from ZERO to running basic SQL queries, something I have been meaning to learn for about 20 years, and I had to think it through, and only scratching the surface.

  • select customer.firstname,customer.lastname,sum( from customer
    inner join Invoice
    where customer.customerid = invoice.customerid
    and ( customer.customerid = 1
    or customer.customerid = 6
    or customer.customerid = 59
    group by customer.customerid
    order by -sum(; /* The negative for descending order...
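
A self-contained sketch of the same kind of query, run through Python's `sqlite3` module against a tiny in-memory database. The column `invoice.total` and the sample rows are assumptions for illustration (the column summed in the query above is not shown). It uses `ON` for the join condition, `IN` instead of chained `OR`s, and `ORDER BY ... DESC` instead of negating the sum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customerid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE invoice (invoiceid INTEGER PRIMARY KEY, customerid INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Ann', 'One'), (6, 'Bob', 'Six'), (59, 'Cy', 'Fiftynine');
    INSERT INTO invoice VALUES (1, 1, 5.0), (2, 6, 20.0), (3, 6, 1.5), (4, 59, 10.0);
""")

# Total spend per customer, biggest spender first.
rows = conn.execute("""
    SELECT customer.firstname, customer.lastname, SUM(invoice.total) AS spent
    FROM customer
    INNER JOIN invoice ON customer.customerid = invoice.customerid
    WHERE customer.customerid IN (1, 6, 59)
    GROUP BY customer.customerid
    ORDER BY spent DESC
""").fetchall()

for row in rows:
    print(row)
```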

  • I can see how a real SQL query could become quite complex, quite quickly.

  • Anna Gray made a comment

    Got stuck here...for a couple of days.. moved on. Subqueries and Joins from multiple tables needs more work and more documentation....

  • Anna Gray made a comment

    Just spent a really engaging few days ( about 5 ) getting to grips with Sqlite, filters, code and the sheer power of it. I am very impressed while realising I am just scratching the surface of the power of this tool.

    I am fortunate that I code ( C / Python ) also so can see parallels / different ways of achieving tasks, and do appreciate that SQL provides...

  • So a personal learning experience. I spent about 4/5 days late October writing some Python code to filter the flights.csv file and was successful

    Now I have spent about 4 days learning SQLite and have two identical CSV files of filtered departures for LAX, days 5 ( time >= 1700 ) /6/ 7 ( time < 1200 )
    I have learnt a lot about big files and the power of...

  • Anna Gray made a comment

    select firstname,lastname from customer where customerid = 42;

    customer id is an INTEGER, match can be with integer, no quotes required

    select employeeid,firstname,lastname from employee where address like '%77%';

    address is NVARCHAR(70), thus LIKE search term ( 77 ) is in quotes

    sqlite> select count(customerid) from customer where postalcode is...

  • Fascinating to be working on real but sufficiently complex dataset

    @Richard R : Good spot '<=' is correct, =< does not work

  • A potential for confusion is that the download from github gives two sqlite files, a .sql and a .sqlite.
    The chinook_sqlite.sql is human readable using a text editor ( sublime text etc ) . The .sql file is in effect a .dump of the database file

    In SQLite you can read the sql or open the sqlite

    sqlite> .read Chinook_Sqlite.sql
    sqlite> .open...

  • @AndrewB @BolajiAhmed :
    what extent this actually happens in the 'real world'.
    Almost never : data would be added directly to the SQL DB, only by DB maintenance or analysis IT devs manually in this fashion.
    The majority of manual data entry is done via Web interfaces which allow / provide software data error, validity and form checking ( formats of...

  • @AndrewB : I wonder to what extent this actually happens in the 'real world'.

  • sqlite> .mode csv
    sqlite> .import flights.csv flights
    sqlite> .tables
    sqlite> select count(year) from flights;

    Cool : took a couple of days working around SQLite to get to this point, still getting the hang of adding a ; to the end of NON DOT commands, and importantly NOT adding a Semicolon to the end of DOT commands as you end up with...

  • SQLite is BIG!

  • I am going to have to reread this a few times I think!
    This is important and fundamental.

  • Why would it make more sense to use a separate integer value for a primary key
    Format constraints for PK and error checking capability prior to Query of PK
    Data security : PK not based on P.I.I. ( eg name / DOB )..GDPR
    Immutable : The key never has to change, other aspects of the record can be
    Zero potential for...

  • There is a LOT of data out there to collect, with the 'data' and the complexity of it evolving all the time.
    More complex models will be required to capture and link related data together ( I guess ! ) to allow relationships between data points to be found, mapped and information derived.

  • Anna Gray made a comment

    Scatter plot for this data is pure noise with zero correlation between height and weight, zero, even when boxed off to 5/10 cm intervals.
    This can't be a data set for sports anything, and unlikely Olympian women who would typically come in around 55kg for a 165cm female athlete ( eg Laura Kenny ).
    Even if this data was for a specialised sport that is...

  • Great to see the FT open-sourcing its data presentation template as best practice

  • Seeing the wood for the trees can be hard : visualising information in more and more interesting ways has become very useful, interpreting the data remains something of an art.

  • Anna Gray made a comment

    Didn't know pivot tables before this evening

  • I can see that with numbers as easy to read as a 24 hour time, written in 2/3 digit numbers 730, not 0730, or a month in text ( January, February etc ) converting this into numbers which are easily sorted can be hard in Excel.

    Also calculation in base 7/12/50 or 24 ( 0830 - 0930 is not 100 but 60 minutes ) for example
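
A small sketch of the base-60 point in Python: convert the HHMM number to minutes since midnight before subtracting. The function name `hhmm_to_minutes` is made up for illustration.

```python
def hhmm_to_minutes(hhmm):
    """Convert a time written as an integer like 730 or 1700 to minutes since midnight."""
    hours, minutes = divmod(hhmm, 100)
    return hours * 60 + minutes

# 0830 to 0930 is 60 minutes, even though 930 - 830 is 100.
elapsed = hhmm_to_minutes(930) - hhmm_to_minutes(830)
```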

  • Anna Gray made a comment

    Enjoyed the Excel left/right exercise, long time user of spreadsheets with quite complex formulas ( references, offsets, rolling averages etc ) but not used those

  • From sample-flights.csv
    I have 213 flights departure LAX all days 1-7
    88 flights departure days 5/6/7 LAX
    47 flights LAX between
    Day 5 departure after 1700 ( Tail N8654B )
    and departure
    Day 7 before 1200 midday ( tail N364AA)
    sorted by departure time

  • Wow, that was a learning experience
    =VALUE(RIGHT(B2,4)) for the year in numbers
    =LEFT(B2, SEARCH(" ",B2)) for the Month in Text
    =LEFT(B2,SEARCH(",",B2)-1) gave me eg April 23 from April 23, 1983
    =VALUE(RIGHT(C2,(LEN(C2)- SEARCH(" ",C2) ))) gave me 23 from April 23
    9 from July 9
    As you had to calculate the length of string between the Space and the end...
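
For comparison, a sketch of the same split in Python, where the string methods do the position arithmetic for you. The sample value is the one quoted above; the variable names are illustrative.

```python
text = "April 23, 1983"                  # sample value, as in cell B2

month_day, year_text = text.split(",")   # "April 23" and " 1983"
month, day_text = month_day.split(" ")   # "April" and "23"

year = int(year_text)                    # int() ignores the leading space
day = int(day_text)
```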

  • @PeterTurnbull @RichardR

    They made the same mistake in the last course confusing Variance Population with Variance sample

    Variance Population = 240 / 9 = 26.66 ( we all agree )
    Sum of (Mean - value ) ^ 2 = 240
    n Population = 9

    Variance sample which has been quoted as the answer
    = 240 / ( Population- 1 ) = 240 / 8 = 30
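
The two divisors, checked in a few lines of Python using the figures above (sum of squared deviations 240, n = 9). The function name `variances` is made up for illustration.

```python
def variances(sum_of_squares, n):
    """Return (population variance, sample variance) from the sum of squared deviations."""
    population = sum_of_squares / n        # divide by n
    sample = sum_of_squares / (n - 1)      # divide by n - 1
    return population, sample

population, sample = variances(240, 9)
print(round(population, 2), sample)
```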

  • I think we are looking for patterns here and outliers. Is there a destination, airline or time of day that is 'problematic' with extra delayed or cancelled flights?

  • An hour lost on this : is it me or is there an error and confusion regarding the sample set and quoted Variance calculations

    For data
    Sum = 867
    Mean = 57.8
    Sum of square ( xi - 57.8 ) ^2 = 7686.4

    n = 15
    n-1 = 14

    7686.4 / 15 = 512.43 ( Variance : Population )
    7686.4 / 14 = 549.03 ( Variance : Sample...

  • All three average values, mean, median and mode of a data set can often be more misleading than useful, especially if the data is not especially coherent or not 'normal distribution' / 'bell curve' in shape

  • I think measure of Time in general is one of those special categories, where time itself is continuous, we break it down into seconds, minutes, hours...decades etc. It is also determinate, 1700 will come an hour after 1600 as Wednesday comes after Tuesday.

    We do represent time in numbers, discrete blocks and we can quantitatively measure a period of time,...

  • Companies like McDonald's, Pizza Express, Starbucks etc must have heaps of data on this type of thing where the restaurants are as identical as they can be in terms of customer offering, same menu, same ingredients etc, but, also differ a great deal in terms of location, staffing size of premises, and from a very simple, top down, business point of view,...

  • Anna Gray made a comment

    I took a couple of days to figure out some Python code to process / filter the flights.csv file which I really enjoyed as a personal learning goal.

  • Focus on : flights departing after 5pm on a Friday and before 12pm (midday) on Sunday, DEPARTING LAX
    This will allow the flights.csv file to reduce in size significantly, from > 5 million rows to a few thousand.
    I'm Just not quite sure how to do that!

    Update 43309 lines I think, courtesy of a couple of days of python coding
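
A hedged sketch of one way to do that filter with the standard `csv` module, reading the file row by row so the size does not matter. The column names `ORIGIN_AIRPORT`, `DAY_OF_WEEK` and `SCHEDULED_DEPARTURE` are assumptions about the file's header, as is the encoding of Friday as day 5 and times as HHMM numbers; this is not the code referred to above.

```python
import csv

def keep(row):
    """True for LAX departures from Friday 17:00 up to Sunday 12:00."""
    if row["ORIGIN_AIRPORT"] != "LAX":
        return False
    day = int(row["DAY_OF_WEEK"])            # assumed: 5 = Friday, 7 = Sunday
    time = int(row["SCHEDULED_DEPARTURE"])   # assumed: HHMM, e.g. 1700
    return (day == 5 and time >= 1700) or day == 6 or (day == 7 and time < 1200)

def filter_flights(src_path, dst_path):
    """Stream src_path row by row, write matching rows to dst_path, return the count."""
    kept = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if keep(row):
                writer.writerow(row)
                kept += 1
    return kept
```

Usage would be something like `filter_flights("flights.csv", "lax_weekend.csv")`, where the output file name is hypothetical.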

  • Python and SQL are core tools for any analyst's tool box, crunching through big files ( data sets ) very easily.

  • I noted 'helps your natural pattern-recognition abilities ' which is totally sensible. It will be interesting to see how AI evolves in this space which is adept at spotting faint patterns, correlations etc in seemingly un-connected data sets

  • Excellent point, the old computer rule of Garbage in, Garbage out applies to data evaluation and analysis.

  • DDDM, by definition, is based on analysing data that exists, has been collected, structured and thus has already aged to some degree, an hour, a day, month or year etc.

    Decisions of their nature determine the future, and while historic insight is important ( how did we get to here ), it might not provide insight into 'where are we going?'.

    There is a...

  • @ShirleyPhillips : Penny dropped while I was cycling. I am guessing we are not supposed to be able to open flights.csv with Excel, it is simply too big.
    We will be taught Python or SQL to do that

  • The file flights.csv is huge, 5,819,080 rows . Day 31, month 12 starting at line 5,805,948, representing ~13,000 rows of data / day!
    You can use Sublime Text to view the file in raw text. Libreoffice Calc / Excel has a row limit of 1,048,576 rows.

  • Can the use of such a system justify the application of the right not to be subjected to automated individual decision-making?

    The question is necessarily complicated by the 'NOT'

    To rephrase ( if only for my own clarity )
    Should a person have decisions made, that could affect their future, by a computer / algorithm, based on data they have uploaded to...

  • This area of law, morals and public debate is still under development.

    A key aspect of the Mario Costeja González case was that the argument that the original article in the newspaper should be retrospectively redacted was thrown out.

    It is still a matter of public record, in the newspaper that Mario Costeja González was subject to court proceedings, it...

  • Data that is difficult or impossible to change that could, if put together enable identity theft ( DOB / Place of Birth / Current complete address / other biometrics that cannot ever be changed by the Data subject ( me/you ) )

    Full Name
    Exact DOB
    Place of Birth
    Current Address
    Extrapolated information of other family members

  • The data owner of the opinion would be the author of the opinion, not the subject of the opinion.
    Other areas of law cover subsequent ( inadvertent or requested ) release of this data, ( libel/ slander/ discrimination ) etc.

    The data ( Employer opinions ) is private and confidential and cannot be released, companies or individuals could be sued if such...

  • All browsers offer the ability to
    1 delete cookies,
    2 to block cookies
    3 Use extensions like uBlock to limit exposure
    4 More usefully, only allow cookies for the 'session'.

    Why would you choose 4.. Sites need to know that you have logged in and authenticated, banking and financial sites, but it is sensible for you to tell your browser to forget this...

  • Occasionally, its representatives tell visiting parents that personal data of their children, including names, home addresses and genders, are processed.

    Transparency :
    The parents should as best practice be informed automatically when initially submitting data, the scope of all future data processing purposes of their children.

    In the event that that...

  • Many of the larger organisations the Googles / FBs etc have an automated process to release your data, reducing the very real cost of processing.
    The cost of processing and release of data to subjects by data holders can of course be reduced by companies, other entities, by ensuring that they hold very little data on data subjects, or even purging PII data...

  • Prior to the internet, the concept of data privacy, PII, identity theft was the stuff of obscure spy novels.
    No longer.
    Identity theft, loss of financial reputation, is commonplace and costing the public millions a year.
    Companies have till recently been extremely lax about storing customer PII, considering this database a free resource for them to use...

  • A colleague recently conducted a Survey using one of the online Surveys. The Output of the Survey included IP addresses of the participants, but no other PII.
    An individual flagged this, IP addresses are ruled as PII.

    In direct answer to 'Would you be able to indicate whether you will have to acquire data subjects’ consent'
    It would appear that erring on...

  • The UK is fully aligned with GDPR regulations. Trade, irrespective of its nature, will still occur between UK and citizens and entities of EU states. This data must be protected with the same level of safeguards as if the data had not left the EU.

    Regardless of political statements, legal alignment and safeguarding of data will still be necessary between...

  • Anna Gray made a comment

    1) Why is it important for you to know about the General Data Protection Regulation?
    Professional obligation as a professional working in and around IT, to my team, colleagues and customers

    2) What do you know already about this legal instrument?
    The general overview, the importance of ensuring data is kept secure, the importance of obfuscating PII as a...

  • 7 May 2020 Nigeria 2020 budget: OPS backs proposed $20 oil price benchmark

    I had not read this when I posted

  • @DavidMcGovern
    David : I was referring to the Maths and the laws of thermodynamics. When we invent a lossless system that can store and then deliver energy at 100% efficiency, your statement will be correct, but we do not have that in batteries or mechanical systems including air compression systems or pumped storage eg Dinorwig.

    For every Watt of energy...

  • Anna Gray made a comment

    A CBA of smart grids is never going to be favourable to the technology without a 'true' cost of the alternative.

    The alternative, burning fossil fuel, is very cheap to do, can be done at huge scale, with very low complexity, and is proven: coal, gas and even diesel oil generators.

    The true value of smart grids is they try to minimise and then optimise...

  • @MariaGrigg
    It is vital to be mindful of why modern optimisation technology, embodied as computers, algorithms, inter connectivity, data transfer etc is so important in our modern world, specifically for energy generation and delivery, the course subject.

    Not so long ago we ran coal fired power stations 24 hours a day at full tilt, lucky to deliver 35% of...

  • Anna Gray made a comment

    Page 88 really highlights the technical challenges of mass, at scale, grid storage with none of the technologies greater than a few 10s of MW.
    That raises the question for me, is grid storage the way to go?

    Should we be looking ( given currently available technology ) at more local resources, where homeowners EVs and domestic battery storage devices are...

  • @MarkTulley
    Mark, Modern grid management is really important for frequency stabilisation : I am guessing the Battery system will be a Fast Frequency response system, which helps take out the spikes the grid can be prone to when resources ( especially renewables ) come on and off line.
    Gridwatch is a really useful realtime resource showing the real data of...

  • There is no doubt electric 'everything' is the future, especially as renewables are really gaining traction in the energy mix and contributing significant amounts to nation states as we speak.

    The US is just waking up to offshore wind and catching up very quickly with European initiatives in the North Sea, some great podcasts on the US wind industry here :...

  • I think the public is beginning to appreciate 'total cost' of energy more and becoming more receptive to paying the full cost of having a cleaner environment.

    Renewables do have a significant cost in terms of the extra layer of management and technology that needs to be added ( Storage / supply diversity / sophisticated scheduling ) to ensure stability.


  • Mark : it is hard to know the exact situation, and I wonder if some of the narrative about turbines being switched off is entirely accurate.
    As I glance at right now, Monday 11th May at midday, I see that 48% of UK energy right now is supplied by Renewables, 18% from wind, 21% from Solar, 9% from Biogas.
    The maximum peak...

  • @IanBlack
    Ian, I am a little confused. The Carbon Cycle is well understood chemistry, the proof all around us in the form of sun, oxygen, carbon dioxide and plants.

    If we dig a little deeper into the earth, we find all the carbon that has been stored over the earth's entire life in the form of long chain hydrocarbons, coal, oil, gas.

    We are doing a...

  • from :

    'develop captive professional fleets (this model allows for the station to be profitable thanks to high utilization rates) with an investment aid'

    It looks like the initiative will start with large fleets of vehicles owned by entities such as local government, or large...

  • It seems to be common in the media and advertising to label and market many forms of energy systems as 'clean', 'emission free', 'low carbon', without actually telling consumers the full picture of the energy cycle.

    Electric Cars for example are certainly not emission free, and arguably are more polluting if the electricity used to charge them came from...

  • Maria : it looks like a few people are a little confused with the role of Steam Reforming and its place in the renewable discussion.

    Steam Methane Reforming is not part of the renewable energy cycle at all, it just happens to be the current large scale process for making hydrogen for specialist industrial use.

    Steam reforming takes a perfectly good gas...

  • Paul, that is a slightly unfortunate statistic and refers to one particular method of creating hydrogen, cracking methane CH4, into 2H2 and one Carbon + Air / Steam > CO2, that is used for current production. This technique would not be used in the renewable model of Hydrogen.

    The renewable model of hydrogen will rely on Hydrogen production using...

  • The Hydrogen cycle only works from an economic and Carbon point of view if all the electrical energy used to create Hydrogen ( and the by product, Oxygen ) from electrolysis of water, is from non stored renewable sources ( wind and solar )

    In this way and only this way ( given current technology ) can the energy produced from renewables be converted into...

  • There is no CO2 produced by burning hydrogen.
    Two hydrogen molecules burn with one Oxygen molecule ( air ) to create H2O : water

  • Anna Gray made a comment

    The challenge for biogas is that it requires a significant input of energy to collect the raw material, which by definition is usually a waste product of one kind or another, usually crops.
    Sweden has a vast land mass, I am not sure how much is dedicated to crop production and how much can even be used for crops the further north.
    Sweden is also a high...

  • @DavidMcGovern
    I did a quick 'back of the envelope' for 'fun' this morning.
    Take one French nuclear station at random : Golfech
    It generates 18,000 GWh / annum ~ 2GW constant output over 8760 hours

    Let's take the latest LG panels, the...

  • @ShaunSimmons
    It is easy to look at data centres as a highly concentrated consumer of energy, wondering what it all does.

    If you look at it the other way, ask, what are these data centres actually doing, without being too starry eyed about it, they are enabling our modern lives.

    Whether that be scheduling your supermarket delivery,...

  • @AlanHayman
    Alan, completely my pleasure : best wishes

  • The numbers in a country like Nigeria are staggering and I don't think anyone really knows where to start.

    Much of the core investment in Europe this last 2-3 decades has been from private companies and public / private partnerships, the expertise lies within the private sector, the requirement in the public sector.

    Significant investment needs scale,...

  • @StephenIkechukwuOgbonnaya
    Thanks Stephen, both examples listed above demonstrate local, state level political forces, lying behind national energy decisions, and how those decisions can sometimes be influenced by short term, political expediency.
    best wishes

  • Anna Gray made a comment

    Lots of really good, current information at Drax.

    'Drax Power Station is the biggest renewable generator in the UK and the largest decarbonisation project in Europe'

    I do not work for Drax or any related company..

  • Bio fuel sources just slot into the 'Carbon cycle'

    With fossil fuels, the carbon as a core constituent of extracted fuels, were just plants that grew, absorbed CO2, died, decayed, got buried over millions of years and essentially captured carbon from that early earth atmosphere.

    We come along this last 100 years, dig it all up, Gas/Coal/Oil, generally ...

  • Vestas ( I do not work for them or any company related ) do publish a fairly complete breakdown of components and sustainability.
    We can presume that other manufacturers have similar energy payback periods for their turbines.

    The Vestas Sustainability Report 2019 page...

  • see below

  • The primary materials to build a wind turbine are the steel structure and the blades.
    There is no shortage of steel, it has a long life and can be reused / melted down again at EOL.

    The blades do currently present a problem, the material is usually a composite / resin / fibreglass which is non recyclable.

    Vestas as an example have a commitment to improve...

  • @PeterMoore
    The UK ( in particular ?? ) seems obsessed with blocking developments, infrastructure, power, transport links, you name it, unless it is a new big supermarket.
    The discussion in Scotland regarding power lines has some merit, there are overhead lines all over the Alps in France, exactly because of the hydro.

    If Scotland seeks...