A Little Here, A Little There: July 2017

21 July 2017

[Data Science] Whatever I Know About Data Science

Data Science is the subject of gathering insights from data. Yes, it is a fancy name for Statistics.

However, it seems to me that Data Science is commonly interpreted as a hybrid of two subjects, Data Analysis and Computer Science, with the latter carrying more weight. I will share why I think so.

You see, much of what has to be accomplished in data science is done by a computer.

Probability and statistical tools required in data analysis, such as Student's t-test and the chi-square test, can be easily computed using software such as R - no more computing a t- (or z-) score and then referring to a table.
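To give a feel for what "no more referring to a table" looks like, here is a minimal sketch of a one-sample z-test. It uses Python's standard library rather than R, purely to keep the example self-contained. The sample values, the hypothesised mean and the known standard deviation are all made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test(sample, mu0, sigma):
    """Two-sided one-sample z-test, assuming the population sigma is known."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # The normal CDF replaces the lookup in a printed z-table.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical measurements, invented for this example only.
sample = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.3, 10.0]
z, p = z_test(sample, mu0=10.0, sigma=0.25)
print(f"z = {z:.2f}, p-value = {p:.3f}")
```

A t-test works the same way in spirit, except that the p-value comes from the t distribution, which is where a package such as R earns its keep.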

Data visualisation is also made easy by using R's ggplot2 or Python's matplotlib or seaborn - there is no need to draw graphs and scale by hand.

Not forgetting data manipulation and transformation. These can be done easily by using Python or R too.

However, let's not forget the ultimate aim of data science, which is to gather insights from the data. Hence a data scientist has to be good at both programming and statistics. I opine that he does not need to be overly good at programming, just good enough to gather, clean, manipulate and analyse the data. But I would place more emphasis on the statistics part because it is statistics that allows the data scientist to quantify the findings.

Here I list the resources I learnt from.

Statistics

As an engineer, I learnt statistics from JC till University. Also, my Master's degree in Industrial and Systems Engineering is heavy on statistics, although I do not think my statistics is very strong.

Nevertheless here are the resources:

Statistics for Experimenters: My lecturer swore by it because it is written by legends. This book is such a classic that it has only two editions, but it is easy to read because the concepts are delivered through real and relatable experiments.
Applied Statistics and Probability for Engineers: This is another textbook, more of a supplement to my course.
The site Seeing Theory - A visual introduction to probability and statistics is a cool site to learn about the topic graphically.

In short: basically any probability and statistics textbook or resource that you can get hold of. The ones I listed are those I used.

Programming

I had exposure to R because my master's course was very statistics-orientated. R is free and open source, has a big community supporting it, and does the work. SPSS and SAS are simply too expensive to be accessible, in my opinion. One of my lecturers used R to do bootstrapping and it made me want to learn more. I had also been wanting to learn a programming language, and R seemed to fit the bill. Hence my inclination is towards R when it comes to data science.

I learnt R mainly from DataCamp, and from tons of trial and error. You have to pay for a subscription, which I did, if you wish to access the more in-depth courses. Otherwise, the free way is to learn R in R using swirl. R also has a resource for learning probability and statistics, aptly called Introduction to Probability and Statistics Using R (IPSUR). RStudio, along with Shiny and RMarkdown, is a powerful IDE for R.

I got to know Python along the way while researching R and Data Science. I started learning the basic syntax from Codecademy. Thereafter I learnt from courses on Udemy. The Python for Data Science and Machine Learning course on Udemy is an especially good primer for learning data science in Python. I discovered that Python is a very easy language to learn. Compared to R, its syntax is more elegant. It is also more versatile when it comes to big data, and it can be integrated with other languages and applications.

To really learn Python well enough to write working scripts, I would recommend the book Automate the Boring Stuff with Python. I have also read many other books on Python, such as Python for Data Science for Dummies and Python Programming for the Absolute Beginner.

I also got to know about Kaggle and KDnuggets. Both are very good sites to browse for data science related information.

Machine Learning

I regard machine learning as a computer science subject, although its statistical counterpart is called statistical learning. Anyway, statistical learning is normally implemented using a computer, and therefore machine learning is the more familiar term.

Machine learning is basically using a computer to identify patterns and then make predictions, without hard-coding the logic. In a way it seems like the machine learns about the data. It is a broad but interesting topic.

I would strongly recommend these two resources:

The video lectures on the book An Introduction to Statistical Learning with Applications in R, given by the authors of the book, who are Stanford University professors. Sit through the 15 hours' worth of lectures. It is worth it.
Andrew Ng's Machine Learning course on Coursera. Coursera, co-founded by Ng, has a wealth of courses. However, his own course is the go-to for machine learning. I paid for the certification and it is worth it.

Projects

I think the fastest way to learn is to really do it hands-on. Sure, the courses will have exercises and assignments, but I do not think they are enough. I pick up data sets that I think would be interesting to analyse, or I take a personal project that I wish to implement as a programme.

I have:

Continuous Learning

Well, my reading never stops. These are the books that are in the queue:

But I am not a Data Scientist, yet.

~Huat

19 July 2017

[Investing] Netlink NBN Trust Debuts Today - Did You Huat?

So, the very hyped Netlink NBN Trust debuts today. The ticker is CJLU.

"NetLink opened trading at S$0.815 per unit before edging down to S$0.810 on its first day of trade." - From CNA (link above)

All in all, it went up half a cent, or about 0.6%. By the way, this 0.6% will be wiped out by transaction costs when selling.

Sparked by curiosity, I found out that Shareinvestor has a wealth of information. I found data on historical IPOs and their performance. There are impressive winners, such as those whose stock price increased by more than 100% (e.g. UnUsUal, Samurai); of course there are impressive losers too. From eye-balling the data (i.e. not validated), it seems there are more losers than winners.

I regard IPOs as a form of speculation. Firstly, there are market participants who aim to flip for a profit, relying on the fact that IPOs generally gain a decent amount during the first week of trading. This usually works only if one has a lot of capital. It does not make sense for small-timers like me. For example, if I could only afford $1,000, a 45% increase would be just $450. On the other hand, it would be $45,000 for a person with $100,000. Sometimes absolute numbers are important.
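The point about absolute numbers is simple arithmetic, sketched below. The helper name `absolute_gain` is my own and the 45% figure is just the example used above, not a prediction.

```python
def absolute_gain(capital, pct_gain):
    """Dollar gain on `capital` for a percentage price increase."""
    return capital * pct_gain / 100


# Same 45% increase, very different dollar amounts.
for capital in (1_000, 100_000):
    gain = absolute_gain(capital, 45)
    print(f"${capital:,} invested -> ${gain:,.0f} gained")
```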

Secondly, prospectuses are always very optimistic, but always remember that a prospectus is a way to gather capital. It is no different from a sales brochure. There will be much hoo-ha about how well the business is going to do, how good the leadership is, and so on. Fortunately, the one good thing that comes out of a prospectus is the numbers. Numbers do not lie. Analyse the numbers carefully to see if the investment is worthwhile.

Finally, there is no certainty that the IPO will move by, say, 20% or 40% or even -10% when it opens. We can hope. But there is no certainty. I wonder how it will do over the next few days, especially since there could be a massive sell-off by opportunists who hoped to profit from a spike on opening day.

Since we are talking about certainty, there is a line in the CNA article that I quote:

"As a business trust, the future cashflow is predictable, so there is a lack of imagination on this kind of IPO,"

Just to note: as a long-term value investor, I appreciate certainty and predictability; I do not need or want to be imaginative. I look for stocks that are profitable, financially strong and sustainable, with a healthy cashflow.

For CJLU, my view still holds: I am not going to consider it for some time, even if it means a missed opportunity. You can read my opinion in the previous post.

~Huat

10 July 2017

[Investing] IPO Review - Netlink NBN Trust

There is much hype about this IPO because, firstly, it is going to raise $2.3 billion, the biggest since Hutchison Port Holdings (IPO price in 2011: US$1; current price as of writing: $0.45), and it is dubbed one of the 'blockbuster' IPOs of 2017.

Netlink is no stranger to households here. They were formerly known as OpenNet. Our broadband is set up by them. They (probably) own the optic fibre network here.

A good summary of the 'boring' points can be found here.

The salient points are:

Each unit is going for S$0.82
About 2.9 billion units are going to be issued
Total Market Capitalisation is about S$3B
Average EPS is 1.5 cents
Cash is S$92M, Debt is at S$1.6B
Dividend Yield is about 5%
Use of proceeds - 1) purchase Singtel's assets and 2) repayment of S$1.1B loan to Singtel

Normally, my analysis covers Financial Strength, Profitability and Sustainability. But I have many questions, and hence red flags, just from listing the points above, even though I have not read the prospectus.

A Very High P/E

Just from the points above, the P/E of the trust is going to be about 54x. One conventional way to look at P/E is as the number of years it takes for our investment to break even. In this case it will take 54 years. I also look at its reciprocal, E/P, the earnings yield, which is about 1.8%. That is, for every dollar invested in this trust, it earns about 1.8 cents a year.
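The P/E and earnings yield can be checked in a few lines. This is a rough sketch using only the unit price and average EPS from the salient points; the variable names are my own.

```python
price = 0.82   # S$ per unit, from the salient points
eps = 0.015    # S$ per unit (average EPS of 1.5 cents)

pe_ratio = price / eps         # years of earnings needed to cover the price
earnings_yield = eps / price   # the reciprocal, E/P

print(f"P/E = {pe_ratio:.1f}x")
print(f"Earnings yield = {earnings_yield:.1%}")
```

The result lands between 54x and 55x, with an earnings yield of roughly 1.8%, in line with the figures quoted above.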

The high P/E could also mean much hype or expectations for the trust.

To me anything that is greater than 30 is too high, although there's really no hard and fast rule for P/E.

Too Little Cash, Too Much Debt

As stated, there is only $92M of cash but $1.6B, or $1,600M, of debt; that's about 17x. Even if the proceeds from the IPO are used to pay off $1.1B of it, $0.5B, or $500M, remains. That is still more than 5x of cash!
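The cash and debt figures can be put side by side as below. It is a back-of-envelope sketch that assumes the whole S$1.1B repayment comes out of the S$1.6B debt and that cash stays at S$92M.

```python
# All figures in S$ million, taken from the salient points.
cash = 92
debt = 1_600
repayment = 1_100  # loan repayment to Singtel from the IPO proceeds

debt_after = debt - repayment
ratio_before = debt / cash        # roughly 17.4
ratio_after = debt_after / cash   # roughly 5.4

print(f"Debt before repayment: {ratio_before:.1f}x cash")
print(f"Debt after repayment:  S${debt_after}M, {ratio_after:.1f}x cash")
```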

From the EPS, I infer that there will be challenges repaying that debt from business operations alone, unless they grow their business, which is my next point.

Lack of Growth Plans

It seems that there are no growth plans (or maybe there are). It is stated that the 'fixed residential wired broadband household penetration stood at 88% as of Dec 2016'. I suppose there is still 12% potential. I did not see any other plans, like R&D or overseas expansion.

Sustainability of Dividend

I highly doubt the sustainability of the dividends at 5%. A 5% dividend at S$0.80 translates to about S$0.04 per unit. Recall that EPS is about 1.5 cents, that is S$0.015. I do not know where the trust is going to find the money to top up the difference.
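The gap between the dividend and the earnings can be worked out as follows. This is a simple sketch comparing the dividend against EPS only; it says nothing about other sources of cash the trust might have.

```python
unit_price = 0.80       # S$ per unit, rounded as in the paragraph above
dividend_yield = 0.05   # about 5%
eps = 0.015             # S$ per unit (1.5 cents)

dividend_per_unit = unit_price * dividend_yield   # about S$0.04
shortfall = dividend_per_unit - eps               # not covered by earnings
payout_ratio = dividend_per_unit / eps            # dividend as a multiple of EPS

print(f"Dividend per unit: S${dividend_per_unit:.3f}")
print(f"Shortfall vs EPS:  S${shortfall:.3f}")
print(f"Payout ratio:      {payout_ratio:.2f}x earnings")
```

On these numbers the dividend is more than two and a half times the earnings per unit, which is the gap I cannot explain.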

A Cash Bump for Singtel

It seems that most, if not all, of the proceeds are going to Singtel. Not only does Singtel receive money for its loan to Netlink, it is also able to 'dispose' of some of its assets for cash. If you ask me, Singtel seems to be a big winner here.

Conclusion

I believe the red flags I have listed are enough to deter me from participating in this IPO. I could still participate and take whatever gains come after it is launched, but I would not hold it for long - not with a P/E that high and not when I cannot figure out how it is going to sustain the dividends. What I am saying is that there are other opportunities around. Remember that after 6 years, Hutchison Port Holdings is now at 45% of its IPO value.

Other References:

http://www.theedgemarkets.com/article/blockbuster-listings-pipeline-singapore-2h17-deloitte
http://www.straitstimes.com/business/netlink-nbn-trust-set-to-be-biggest-ipo-in-singapore-in-six-years-with-pricing-at-81-cents
https://www.shareinvestor.com/fundamental/factsheet.html?counter=NS8U.SI

~Huat

01 July 2017

[PSA] Battle of the Milk

In this post, I explore the costs of various types of milk.

My son was weaned off his mother's milk about 3 months ago and we have been feeding him formula milk. I have been wondering why formula milk is so expensive. Recently, there was much controversy about the price of formula milk powder in Singapore. Older folks will remember KLIM. Even older folks will remember drinking condensed milk. Some people think all milk is the same, but if that is so, why are the prices different? Is it solely because of marketing?

My wife and I have been contemplating a switch from formula milk to fresh milk. Our concerns include whether our boy would accept it and whether he is sensitive or allergic to fresh milk. Besides the formula milk, we tried giving him fresh milk. Recently, we have been trying a "less-branded" formula milk. Luckily, he seems receptive to all of them.

Left to Right: Pura Fresh Milk (1L), Dumex Dugro (700g pack), Enfagrow (900g tin)

Mead Johnson's Enfagrow 3
900g
S$45.30
Servings per tin: 22
Cost per serving: S$2.06

My son has been drinking this as a supplement to his mother's milk. There are actually two flavours available - original and vanilla; I learnt about this because I accidentally bought the vanilla one and did not realise it until much later.

Pros: Well-known brand (best-selling internationally, too). Nutrition is superior to that of the other two.
Cons: Not sure if the superior nutrition is absorbed by the boy. And pricey.

Pura Fresh Milk
1L
S$3.60
Servings per carton: 4
Cost per serving: S$0.90

There are many brands of fresh milk available. Milk of Australian and/or New Zealand origin is preferred. Other brands that can be considered are Farmhouse, Paul's and Marigold. Meiji, by the way, is from Thailand, although it is a Japanese brand.

Pros: Fresh. And cheap (cheapest).
Cons: Need to replenish regularly, but not too much because of the shorter shelf-life. Need to refrigerate, THEN warm it before feeding.

Dumex Dugro Stage 3
700g
S$18.90
Servings per pack: 18
Cost per serving: S$1.05

We started trying this only recently because of the 'hassle' of fresh milk. The nutrition level is not as high as Enfagrow's, but that's OK because our son is eating solids too, and there's always fish, pork and other goodies in them. By the way, I grew up drinking KLIM.

Pros: Seems like a balance of the other two options - cheap and convenient.
Cons: Can't think of any yet, except that it is cheap enough to have some psychological effect.
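
The cost-per-serving figures for the three options can be reproduced with a short script. It is only a sketch using the prices and serving counts listed above; the dictionary layout is my own.

```python
# (price in S$, number of servings), as listed for each product above
products = {
    "Enfagrow 3 (900g)": (45.30, 22),
    "Pura Fresh Milk (1L)": (3.60, 4),
    "Dumex Dugro Stage 3 (700g)": (18.90, 18),
}

cost_per_serving = {
    name: round(price / servings, 2)
    for name, (price, servings) in products.items()
}

# Print from cheapest to most expensive per serving.
for name, cost in sorted(cost_per_serving.items(), key=lambda item: item[1]):
    print(f"{name}: S${cost:.2f} per serving")
```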

To me, there is a lot of psychological warfare in this formula milk powder thing. 'You mean you can't part with $2 per feed, to provide the best for your child?'.

As parents, we want the best for our children, but we have to be realistic about the price we pay and the benefits it can bring to our children.

There is the price factor and then there is the nutrition factor. Enfagrow has almost double the amount of nutrients such as DHA and choline compared with Dumex Dugro. (Double of those things, double the price; seems legit.) I am not an expert, but I doubt that my child is able to absorb all those nutrients in one feed of formula milk (educate me if I am wrong, please!).

We are lucky that our boy is eating well, is not too picky yet, and is receptive to the milk 'experiments' we subject him to. And he is not allergic to either the fresh milk or the formula milk. So, fortunately for us, a cheaper alternative turns out to be a good middle ground.

~Huat