Saturday, May 23, 2015

Questions...

A coworker sent an e-mail with a list of questions that some grade school kids have for people in the field of computer science. The questions are intriguing enough to me that I want to spend some time personally thinking about them and answering them. The questions fall under these categories: (1) personal inspiration, (2) classes or learning, (3) about the computer science field in general, (4) personal projects, and (5) women in CS. The fifth category is there because the questions came from a forum focused on girls in school.

Personal inspirations

What got you into coding? What inspired you? ...

I have an uncle who worked in electrical engineering. When I was in the third grade, he was the only one in our family with a computer (and a handheld camcorder). My mom brought me to his home after school, while he was still at work, and let me play with his computer. I mostly played games, but once I drew Goemon Impact using MS Paint. I also destroyed many expensive IC chips that he kept in his drawer in an ill-conceived attempt to design my own imaginary circuit board: I didn't know about soldering, so I bent the pins to make the chips stay on the board. He wasn't married back then, so he spent many lonely, girlfriendless nights unbending the pins.

For some reason he never reprimanded me. I feel very sorry every time I think back. Note to self: do not let kids play with the stuff in my drawer.

My mother was looking to learn to use the "personal computer," and she borrowed a book from my aunt about how to use MS-DOS. My mom never ended up reading the book, and for some reason the book fell into my hands. Back then, operating systems came with some programming tools, so one of the chapters in the book was about using the assembler MASM and how to diagnose programs using DEBUG.COM. These are programming tools at a very low level, very close to how the actual hardware works.

One day in third grade, after some after-school science class that she sent me to, my mom brought me to a computer shop with a books section. As I was browsing the books, I found one that said MS-DOS and programming in the title. I had no idea what was in it, but as I flipped through the pages, I accidentally ripped a page. It was a minuscule tear, but I somehow felt obligated, so I convinced my mom to buy it. I resolved to read it until I understood it completely so I wouldn't waste my parents' money. The book was about the MS-DOS system architecture, and was the second volume of a three-volume series. The third volume was about MS-DOS system calls and contained programming examples on how to use them. I eventually got the third volume as well, but I never went back to the first volume.

So as fate would have it, my first programming language was 16-bit Intel 8086 assembly. It was as close to the hardware as it could be. I was writing instructions that perform arithmetic using machine registers and jump to memory addresses. When my program crashed, it also crashed the whole system, so I had to hit the reset button to reboot the computer. I did that many times, and the button became very polished. But if you think about it, low-level systems were really simple because they had to be. Modern hardware is more complicated, but there is still a subset that is somewhat approachable.

In fourth grade, my mom sent me to a class that taught programming in BASIC. It was a rather high-level but unstructured language. I think it was helpful to learn computing from both the high and the low level. Then I went to the middle and learned C, and later C++. The kind of programming I did was all pretty naive. I knew about table lookup, but didn't know about binary search or data structures until college. Data structures are about how to organize data in a computer so that it can be inserted, queried, and deleted quickly. Not knowing how to organize data effectively means your program will likely run slower than it needs to. This brings us to the next topic.
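To make the difference between plain table lookup and binary search concrete, here is a minimal sketch in C++. The function names `linear_lookup` and `binary_lookup` and the `steps` counter are made up for this illustration; they are not from any particular library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Plain table lookup: examine every entry from the front.
// Returns the index of `key`, or -1 if it is absent.
// `steps` reports how many entries were examined.
int linear_lookup(const std::vector<int>& table, int key, std::size_t& steps) {
    steps = 0;
    for (std::size_t i = 0; i < table.size(); ++i) {
        ++steps;
        if (table[i] == key) return static_cast<int>(i);
    }
    return -1;
}

// Binary search: requires the table to be sorted in increasing order,
// and discards half of the remaining range at every step.
int binary_lookup(const std::vector<int>& table, int key, std::size_t& steps) {
    assert(std::is_sorted(table.begin(), table.end()));
    steps = 0;
    std::size_t lo = 0, hi = table.size();  // search within [lo, hi)
    while (lo < hi) {
        ++steps;
        const std::size_t mid = lo + (hi - lo) / 2;
        if (table[mid] == key) return static_cast<int>(mid);
        if (table[mid] < key) lo = mid + 1;
        else hi = mid;
    }
    return -1;
}
```

On a sorted table of 1,000 entries, the plain lookup may have to examine all 1,000 of them, while the binary search never needs more than 10 steps. The catch is that the table has to be kept sorted, which is exactly the kind of organization that data structures are about.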

Classes and learning

Why do you think it's important to study computer science? What classes should I take?

Before answering questions about classes and learning, we need to understand what computer science is about.

Computers are stupid. They follow your instructions exactly and do nothing more. Computer science is about how to make computers smart and more effective, so you can do more with these otherwise incredibly stupid machines. What makes computer science a prolific field is that both hardware and software can be cheaply replicated, so a good computer scientist can do great things with computers, and more by adding more computers.

For this to be possible, you need to learn about how to organize data using data structures, how to solve problems with algorithms, some theories about predicting the program's space and time requirements and other performance characteristics, how to program the computers to talk to each other, and finally, how to make sense of the programs you write.

Furthermore, not all solutions are silver bullets. A solution may perform well under a specific condition, but it might suffer from a bad worst-case scenario, or it may have other trade-offs. It is important to know them so you can apply the right solution to the specific problem at hand. A college class would most likely use sorting algorithms to illustrate these concepts. Sorting rearranges numbers so they appear in increasing order, such that each number is never smaller than the one before it. Of course you could sort in decreasing order as well, but you only need to make minor changes in your sorting algorithm to do that. Or if you're a clever programmer, you implement the sorting algorithm once in a way that can be instantiated to sort in either order, and that can work with data other than numbers.
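As a rough sketch of the "implement it once" idea, C++ templates let the caller supply the ordering. The names `sort_by`, `sort_increasing`, `sort_decreasing`, and `sorted_words` below are hypothetical, and insertion sort is used only because it is short, not because it is the best choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Insertion sort written once as a template. `before(a, b)` returns true
// when `a` must come before `b`; the caller picks the order by passing it in.
template <typename T, typename Before>
void sort_by(std::vector<T>& items, Before before) {
    for (std::size_t i = 1; i < items.size(); ++i) {
        // Move items[i] to the left while it belongs before its left neighbor.
        for (std::size_t j = i; j > 0 && before(items[j], items[j - 1]); --j) {
            std::swap(items[j], items[j - 1]);
        }
    }
    assert(std::is_sorted(items.begin(), items.end(), before));
}

// Both orders are instantiations of the same routine.
template <typename T>
void sort_increasing(std::vector<T>& items) { sort_by(items, std::less<T>()); }

template <typename T>
void sort_decreasing(std::vector<T>& items) { sort_by(items, std::greater<T>()); }

// The same routine also works with data other than numbers, such as words.
std::vector<std::string> sorted_words(std::vector<std::string> words) {
    sort_increasing(words);
    return words;
}
```

Calling `sort_decreasing` on a vector holding 1, 2, 3 leaves it as 3, 2, 1, and `sorted_words` puts words in dictionary order, all without writing the sorting loop a second time.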

When it comes to the trade-offs of sorting algorithms, quick sort tends to be the fastest in practice, but a simple implementation that always picks the first or last number as the pivot can be very slow if the numbers are already almost sorted or sorted in reverse. Insertion sort is generally slow, but is fast if the numbers are almost sorted. Merge sort has no such "it depends" scenario but tends to be slower because it moves the data around more, and in its simpler implementations, uses more space.
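The "it depends" behavior of insertion sort is easy to observe by counting comparisons. This is a minimal sketch; the function name `insertion_sort_comparisons` is made up for illustration, and the comparison count is only a rough stand-in for running time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sorts `v` in increasing order with insertion sort and returns how many
// element comparisons were made along the way.
std::size_t insertion_sort_comparisons(std::vector<int>& v) {
    std::size_t comparisons = 0;
    for (std::size_t i = 1; i < v.size(); ++i) {
        for (std::size_t j = i; j > 0; --j) {
            ++comparisons;
            if (v[j - 1] <= v[j]) break;  // v[j] has reached its place
            std::swap(v[j - 1], v[j]);
        }
    }
    assert(std::is_sorted(v.begin(), v.end()));
    return comparisons;
}
```

On 1,000 numbers that are already in increasing order, this makes 999 comparisons. On the same numbers in reverse order, it makes 499,500. Same algorithm, same amount of data, and roughly a 500-fold difference in work.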

Such trade-offs are prevalent in the field, and some people dedicate their lives to studying really subtle trade-offs, because even subtle ones can cost millions of dollars on really big problems.

Coding can be learned independently any time from books or websites, but I find that computer science concepts are better learned in a formal higher education setting, e.g. in college, and in some cases, grad school.

About the computer science field in general

What do you find most challenging? What do you think the computers will be like in the next 50 years?

Remember, computers are incredibly stupid machines. The computers these days appear smart only because some dedicated computer scientists figured out how to make them so. I am of the opinion that computers can only be as smart as the person programming them, and that even if someone figures out how to make computers program themselves, the cumulative result of computers programming themselves to program themselves will converge to a finite limit. Of course, that means that computer scientists cannot be replaced by computers, so that guarantees some job security!

The difficulty is trying to figure out what problems there are left to solve. Many problems such as sorting, data structures, networking, and even distributed computing have good enough solutions, which means the likelihood of a breakthrough is small. There are some open problems pertaining to computer vision and artificial intelligence. Some problems are interesting in a theoretical sense but difficult to prove. An example is the P =? NP problem: whether every problem that a non-deterministic machine (computation happens in an unlimited number of parallel universes) can solve quickly can also be solved quickly by a deterministic machine (computation happens in a single sequential universe), where "quickly" means in time that grows only polynomially with the size of the input. It's still unsolved.
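One way to get a feel for the P =? NP question is the gap between checking an answer and finding one. The sketch below uses the subset sum problem (pick some numbers from a list so that they add up to a target), which is my choice of example and is known to be NP-complete. The names `check_answer` and `find_answer` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Checking a proposed answer is quick: add up the chosen numbers once.
// Bit i of `chosen` being set means numbers[i] is part of the subset.
bool check_answer(const std::vector<int>& numbers, std::uint32_t chosen, long target) {
    long sum = 0;
    for (std::size_t i = 0; i < numbers.size(); ++i) {
        if (chosen & (std::uint32_t{1} << i)) sum += numbers[i];
    }
    return sum == target;
}

// Finding an answer by brute force: try the 2^n subsets one after another.
// Returns true and stores the subset in `found` if one adds up to `target`.
bool find_answer(const std::vector<int>& numbers, long target, std::uint32_t& found) {
    assert(numbers.size() < 32);  // keep the subset bitmask within 32 bits
    const std::uint32_t subsets = std::uint32_t{1} << numbers.size();
    for (std::uint32_t chosen = 0; chosen < subsets; ++chosen) {
        if (check_answer(numbers, chosen, target)) {
            found = chosen;
            return true;
        }
    }
    return false;
}
```

A deterministic machine running `find_answer` walks through the subsets one at a time, and every extra number in the list doubles how many there are. A non-deterministic machine could, in effect, try every subset in its own parallel universe and only needs the quick `check_answer` in each. Nobody knows whether there is a clever deterministic method that always avoids the exponential search.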

Cryptography is another highly anticipated field. It's about information security, so that only the intended parties may exchange messages, and no other parties can figure out what the messages are even if they are intercepted. The assumption of cryptography is that it would be computationally expensive to decrypt a message without knowing some secret. But computers are getting faster, which makes it easier to guess the secret. The challenge there is to find different ways to encrypt a message that still make it computationally infeasible to decrypt without the secret.

My personal conviction is to improve the way we understand programming so that programs are easier to understand and less error-prone. But again, this is also a very mature field with mixed results. There is some potential for advancement, but it's not clear what form it is going to take.

Personal projects

What kind of things do you program? What programming language do you use?

I am currently trying to understand the limitations of existing programming languages in order to find a new language design or paradigm that would alleviate these limitations. The limitations aren't about computational power; they are about how to write programs and reuse them more straightforwardly. I'm writing an assortment of common routines in C++ that might help me identify these opportunities.

Women in CS

What does being a woman in CS entail for you? What are some challenges you have encountered as a woman in CS?

Unfortunately I cannot speak for women in CS because I am not a woman. But I love giving the example of Ada Lovelace, a woman, as the first programmer in history. Her mother steered her toward mathematics because she thought it would remedy a family history of mental illness on her father's side, and she went on to work with Charles Babbage. Babbage designed the Analytical Engine, a computer made of gears and shafts, which was never built. One of his other designs, Difference Engine No. 2, was built in modern times according to his original drawings, and the Computer History Museum has exhibited a working one, which demonstrated that his designs worked.

Computer science is a gender-neutral subject, but as in any field, minority groups may be subject to political discrimination in any number of things, from acceptance of academic papers for publication to salary and promotion. Women are not the only group that is discriminated against. Being Chinese, I think we are discriminated against in some ways as well. Once I submitted a paper for review (it was only single-blind, which means I did not know who the reviewers were, but the reviewers could see my name), and a reviewer made a nonsense comment about my grammar as a reason for rejecting my paper.

I think there is also a difference in the learning process between men and women. My hypothesis is that men are more accustomed to unstructured learning (i.e. figuring things out yourself) while women benefit more from structured learning (i.e. in a classroom with a course plan and assignments). Since computer science is a discipline that arose in the last 50 years or so, which is nascent compared to literature, mathematics, or even natural sciences, we have yet to develop effective structured education, which may cause more women to struggle through the study. This situation will likely improve, especially when teachers start to introduce computer science into the high school curriculum. College professors are definitely the subject experts, but grade school teachers know more about education theory and effective teaching than college professors do.

Closing remark...

If you found this blog post because you are a young student looking to enter the field of computer science, I hope this is helpful to you. If you are a computer science veteran, I welcome your feedback.