More challenging projects every programmer should try (web.eecs.utk.edu)

757 points by pavehawk2007 5 years ago | 223 comments

henning 5 years ago |

As someone who has played with writing trading bots but never traded them with real money, some advice: if your results seem too good to be true, they probably are. Your trading bot may be doing unrealistic things or its results may not be reliable if the following are true:

- You are trading in a market with low liquidity or one that is controlled by a small number of market participants. I'm not an expert but I think this would apply more to markets like penny stocks and less to big markets like forex for major currency pairs

- You are not taking transaction costs into account or not doing so properly

- Your bot makes a low number of trades, making the results close or equivalent to lucky coin flips

- Your bot is simply making trades that cannot be executed, or may be doing simulated trades of something that is not actually tradable. This applies to a large number of research papers that assume you can just buy and trade the S&P 500 itself. You can trade ETFs that are tied to an index, but an index is not a tradable instrument in and of itself. Once you realize this, a lot of papers seem very weird

- You are not modelling other aspects of the trading process realistically, such as assuming the bot has infinite funds to trade, allowing it to take unlimited losses and continue trading when in reality you'd be hit with a margin call and your trading would be stopped

- Your code is committing any number of data snooping errors where the bot is asked to trade at time A (say the open of a trading session) but has access to future data (say the closing price of that day, future data that would not actually exist in a live environment)

- Depending on what you believe about how market conditions change over time, your bot may have worked in the past but would not work if used today. I.e., the market may have adapted to whatever edge your bot may have discovered

There are probably lots more pitfalls I don't even know about since I'm not an actual trader.

I'm not discouraging anyone from playing around or trying things, of course. I think it's great fun, which is why I do it.

Here's the good news: if you realize you don't actually have an edge and avoid risking your hard-earned money, you come out ahead of almost all people who ever trade.
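Two of the pitfalls above (peeking at future data and ignoring transaction costs) are easy to reproduce in a toy backtest. The sketch below is illustrative only: the prices, the fee, and the `backtest` helper are all invented, and the "strategy" is deliberately naive.

```python
# Toy backtest: the same naive rule with and without look-ahead, and with costs.
# All prices and the per-trade fee are made up for illustration.

def backtest(opens, closes, use_future_data, cost_per_trade=0.0):
    """Buy one unit at the open and sell at the close when the signal says 'up'.

    use_future_data=True decides at today's open using today's close, which
    would not exist yet in live trading (the snooping bug).
    use_future_data=False uses only yesterday's open-to-close move.
    """
    pnl = 0.0
    for day in range(1, len(opens)):
        if use_future_data:
            signal_up = closes[day] > opens[day]          # not knowable at the open
        else:
            signal_up = closes[day - 1] > opens[day - 1]  # known at today's open
        if signal_up:
            pnl += closes[day] - opens[day] - cost_per_trade
    return pnl

opens = [100.0, 101.0, 100.5, 102.0, 101.0, 103.0]
closes = [101.0, 100.5, 102.0, 101.0, 103.0, 102.5]

cheating = backtest(opens, closes, use_future_data=True)
honest = backtest(opens, closes, use_future_data=False)
honest_with_costs = backtest(opens, closes, use_future_data=False, cost_per_trade=0.25)
print(cheating, honest, honest_with_costs)
```

On this made-up data the look-ahead version looks profitable, the honest version loses, and costs make the loss larger. Nothing about the rule changed, only what information it was allowed to see.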

atomicnumber3 5 years ago | |

I worked in this industry. 2 more common issues:

* Doing a latency-sensitive trade when you don't have good execution. It's easy to go wild in simulation and think you can flip in and out of positions. But if you're a retail trader (and in this context, by that I mean "not connected directly to the exchanges, at the minimum"), you won't get the execution your simulation assumes.

* Not taking into account the impact of your own trading on markets. This is obviously impossible to really simulate. Sometimes you can ignore it (if there's enough liquidity) but I've seen trades that looked great on paper and when trading at small-ish sizes, but then when you try to crank it up and do more volume, prices run away from you.

Obviously, there is money to be made in algo trading. It's big business, and not everyone is doing crazy latency-sensitive stuff; there are quant trades that you could probably do with execution available to retail traders. And honestly I wouldn't be surprised if some retail traders manage it. But I will say that I think for an individual, it's not worth the effort even if you are one of the ones who can consistently make a profit. Just buy S&P 500 ETFs and sit on them and do something else with your time.
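A rough way to see the "prices run away from you" effect is to fill a market order against a static order book. This is only a sketch: the book levels and the `average_fill_price` helper are hypothetical, and a real market also reacts to your order, which this ignores.

```python
def average_fill_price(asks, quantity):
    """Buy `quantity` with a market order against `asks`, a list of
    (price, size) levels sorted from the best (lowest) price upward."""
    remaining = quantity
    cost = 0.0
    for price, size in asks:
        take = min(remaining, size)   # consume as much of this level as needed
        cost += take * price
        remaining -= take
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("not enough liquidity on the book")
    return cost / quantity

# Invented book: thin at the best price, deeper further away.
asks = [(10.00, 100), (10.05, 200), (10.20, 500)]
small = average_fill_price(asks, 50)    # fits inside the best level
large = average_fill_price(asks, 600)   # eats through three levels
print(small, large)
```

The small order fills at the quoted price, while the large one pays noticeably more per unit, even before anyone else has had a chance to react.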

IgorPartola 5 years ago | | |

Would you mind if I asked you a question since I have never worked in this industry but did play with crypto trading a while back. Before Mt Gox was shut down I was trading on a couple of tertiary small exchanges and at the time there was a lot of talk of arbitrage between different exchanges and how transaction latency and fees made it very risky at best and a losing proposition in most cases. But what I was wondering about is whether in a situation like that (one large exchange dominating the market, several smaller exchanges trading the same commodity) if it was possible to use the large exchange as a sort of oracle. Essentially my hypothesis was that Mt Gox sets the price and other exchanges follow on a delay so if I watch Mt Gox I can predict where the price on the secondary exchanges will go a few seconds to a few minutes ahead of it moving. I ran a bunch of historical data through some basic analysis scripts and noticed that indeed there was a pattern but when I actually implemented a bot to trade BTC it lost money more often than not. I am curious if (a) that idea has any validity and (b) was me losing money on this strategy due to latency and implementation errors or due to some fundamental principle of trading that I am missing.

tim333 5 years ago | | |

>prices run away from you

I've found that stuff a lot. People forget that on the other end of the price is a thinking human or robo delegate thereof trying to outsmart you. It's really hard to say what will happen until you try it with real money / assets.

UnpossibleJim 5 years ago | | |

I see a question, so I'm sorry to ask another, but as someone who hasn't done algo trading I'm curious. How quickly are you actually able to trade? I understand with low level trading, through a broker, the trades can be pretty instantaneous. On penny stocks, though, sans broker and with just an API or website that you set up with a field bot, how quickly can you expect these trades to go through, really?

I hear these horrid tales of people killing themselves over bad trades and lines of credit and I wonder how much is poorly written code and how much is the chaos and whim of the market (don't worry, I'm not going to try algo trading).

sitzkrieg 5 years ago | | |

Same here. It's definitely worth pointing out (as you mention) that simulating execution and realistic fills (and don't forget round-trip commissions!) is extremely difficult, and it will absolutely kill strategies that casually look profitable.

pjc50 5 years ago | |

I worked on a unique HFT system which was capable of starting the response packet before the incoming packet's final byte had arrived. Negative latency. If it mispredicted the future, it would simply scramble the trailer and cause the packet to fail UDP checksum.

It didn't make money in the real world because the quality of decision that could be made at that speed wasn't good enough.
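The "scramble the trailer so the checksum fails" trick relies on the receiver discarding any datagram whose checksum does not match. A simplified sketch of that mechanism follows. It assumes a checksum over the payload only (a real UDP checksum also covers a pseudo-header and the UDP header), and the message text is invented.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                            # pad to whole 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF

def receiver_accepts(data: bytes, claimed_checksum: int) -> bool:
    """A receiver recomputes the checksum and drops the packet on mismatch."""
    return internet_checksum(data) == claimed_checksum

payload = b"NEW ORDER BUY 100 XYZ @ 10.05"         # made-up order message
checksum = internet_checksum(payload)              # sent in the header, up front

# The sender changes its mind late: corrupt one trailing byte so the packet
# no longer matches the checksum that has already gone out.
scrambled = payload[:-2] + bytes([payload[-2] ^ 0xFF, payload[-1]])

print(receiver_accepts(payload, checksum), receiver_accepts(scrambled, checksum))
```

The intact payload is accepted and the scrambled one is rejected, which is all the sender needs in order to "take back" a packet it has already started transmitting.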

djoldman 5 years ago | | |

This was an open secret. You can also send the start of a datagram on every incoming packet and then abort if you don't want to add/modify/delete an order by splitting your datagram into multiple packets. This worked on some exchanges because they ordered add/modify/deletes by when the beginning of the packet was received as opposed to the end.

If you do too much of this you will get angry exchange network people yelling at you as they may consider it spam/ddos. Some exchanges explicitly limit this.

It's also worth considering the possible gain. If you're in the fpga (0-150ns) or cpu(300ns-3us) space, the math can come out differently.

tijsvd 5 years ago | | |

This is not unique, it's standard practice. Many exchanges send large UDP packets with the most valuable information at the front. Or the packet is structured such that you can make an informed bet based on size and first sub-message.

Failing the checksum was sort of common as well, but exchanges don't like it.

These days most tricks have to do with avoiding as much serialisation time as possible, e.g. by sending ahead part of the TCP payload and filling in with some innocent order if there was no opportunity.

inopinatus 5 years ago | | |

From time to time there comes a remark that is the absolute epitome of Hacker News, and I mean that earnestly and without any backhanded rancour, and I thank you for this flawless specimen.

KMag 5 years ago | | |

To get such changes reliably with low latency, I presume you were modifying the DMA buffer, without any syscalls or other mechanisms to synchronize between the CPU and the NIC's processor(s), right? Did you just set up the race condition such that when you got bitten, it resulted in a failed checksum, or was the time window so large that in practice the race condition never bit you?

_huayra_ 5 years ago | | |

Wow this is wild! I never thought things would get so latency-sensitive that one would have to overlap request and response!

KMag 5 years ago | | |

That's super clever!

lordnacho 5 years ago | | |

What was the rest of the infrastructure? Was this at a proper HFT firm with everything else streamlined?

imtringued 5 years ago | |

The reality is that stock market trading is like a normal business selling products except you provide liquidity instead of products. Now imagine the market you have chosen needs 1000 widgets per day. If there are competing factories in the market place that already produce 1000 widgets then your only chance of breaking into that industry is by displacing existing competitors. You need to have a better product (or what the finance guys call an "edge"). The vast majority of people don't have such a product idea. They would probably just produce the same stuff the big guys have been making and then wonder why it isn't making them a profit. The market is already saturated with those widgets and there is no need for more factories.

Replace widget with liquidity and factories with traders and you should immediately see why most traders are losing money. They are operating in markets where their liquidity simply isn't needed.

If you want to make money off of trading then you have to find a market with low liquidity where having more traders is actually welcome but why do hard work if you can just invest your money and still get good gains without working?

ddls 5 years ago | |

Because

- management fees, margins, and available capital can easily be modelled properly

- you can easily set up constraints (like no fractional trading for your S&P example)

- you can set up a proper point-in-time database to avoid snooping (especially if you're using earnings reports or other fundamental data which is often actually released _after_ its published release date...)

- you can set up regime-shifting simulation environments (various market conditions)

- you can avoid over-fitting if you're back-testing (with dozens of techniques, most notably : test once and forget about parameter optimization)
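The point-in-time idea from the list above can be sketched in a few lines: every record carries the date it actually became public, and a backtest may only read records published on or before the simulated date. The records, dates, and the `as_of` helper here are invented for illustration.

```python
from datetime import date

# (fiscal period, date the figure became public, value) -- all invented
earnings = [
    ("2020-Q1", date(2020, 5, 4), 1.10),
    ("2020-Q2", date(2020, 8, 3), 1.25),
    ("2020-Q2", date(2020, 9, 15), 1.18),   # later restatement of the same quarter
]

def as_of(records, period, when):
    """Latest value for `period` that was already published on `when`, else None."""
    known = [(published, value)
             for p, published, value in records
             if p == period and published <= when]
    return max(known)[1] if known else None   # max() picks the latest publication

before_release = as_of(earnings, "2020-Q2", date(2020, 7, 1))
after_release = as_of(earnings, "2020-Q2", date(2020, 8, 10))
after_restatement = as_of(earnings, "2020-Q2", date(2020, 10, 1))
print(before_release, after_release, after_restatement)
```

A backtest that simply looked up "Q2 earnings" would see the restated 1.18 for the whole quarter, which is exactly the snooping this structure is meant to prevent.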

I would say that with paper trading and back-testing the serious problems are that :

- your orders don't show up on the book so no one sees and reacts to your limit orders

- your "filled" orders don't affect the book, so you're not affecting liquidity, so the market doesn't change in response to your trading

- your bot has no access to market micro-structure strategies and conditional orders (and if you want to trade fast or are placing big trades you need them)

These are the problems that make any simulation unrealistic, and they are fundamental. It's shadowboxing, which is not entirely devoid of value, but which is certainly insufficient on its own.

(I've worked as a quant developing strategies for several funds these past 15 years)

gjvnq 5 years ago | |

I tried making a bitcoin trading bot and lost almost all of the money (about 10 bucks, so no big problem).

My main mistake was using the historical trades instead of the historical offers as a testing dataset.

kamel3d 5 years ago | | |

Did it work in the end? I want to make a crypto trading bot; it seems simpler than a stock one.

deanCommie 5 years ago | |

> You can trade ETFs that are tied to an index but an index is not a tradable instrument in and of itself.

Out of curiosity, what's the difference?

contingencies 5 years ago | |

Good list. You forgot "over-fitting your algorithm to historical data".

simias 5 years ago |

I would add "build a toy regex engine" to the list.

A couple of years ago I implemented a toy regex engine from scratch (building NFAs then turning them into DFAs). I thought it was an enlightening experience because it showed me that the core principles behind regular languages are fairly simple, although you could spend years optimizing and improving your implementation. How do you deal with unicode? How do you modify your implementation to know how many characters you can skip if you don't have a match in order to avoid testing every single position in a file?

It demystified the concept of a regex engine for me while at the same time making me realize how impressive the advanced, ultra optimized engines we use and take for granted are.
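For a sense of scale, a toy engine along these lines fits in well under a hundred lines. The sketch below is not the commenter's implementation: it supports only literals, concatenation, `|`, `*` and parentheses, builds a Thompson-style NFA, and then simulates the NFA with sets of states instead of converting it to a DFA up front. All names are made up.

```python
class State:
    """NFA state with an optional character edge (`char` -> `out`) and epsilon edges."""
    def __init__(self):
        self.char = None
        self.out = None
        self.eps = []

def compile_regex(pattern):
    """Recursive-descent parser that builds an NFA and returns (start, accept).
    Grammar: alt := cat ('|' cat)* ; cat := rep* ; rep := atom '*'* ;
    atom := literal | '(' alt ')'."""
    pos = 0

    def alt():
        nonlocal pos
        start, end = cat()
        while pos < len(pattern) and pattern[pos] == "|":
            pos += 1
            s2, e2 = cat()
            new_start, new_end = State(), State()
            new_start.eps += [start, s2]       # branch into either alternative
            end.eps.append(new_end)
            e2.eps.append(new_end)
            start, end = new_start, new_end
        return start, end

    def cat():
        start = end = State()                  # empty fragment matches ""
        while pos < len(pattern) and pattern[pos] not in "|)":
            s2, e2 = rep()
            end.eps.append(s2)                 # chain fragments one after another
            end = e2
        return start, end

    def rep():
        nonlocal pos
        start, end = atom()
        while pos < len(pattern) and pattern[pos] == "*":
            pos += 1
            new_start, new_end = State(), State()
            new_start.eps += [start, new_end]  # enter the loop or skip it
            end.eps += [start, new_end]        # repeat or leave
            start, end = new_start, new_end
        return start, end

    def atom():
        nonlocal pos
        if pattern[pos] == "(":
            pos += 1
            frag = alt()
            if pos >= len(pattern) or pattern[pos] != ")":
                raise ValueError("missing ')'")
            pos += 1
            return frag
        start, end = State(), State()
        start.char, start.out = pattern[pos], end
        pos += 1
        return start, end

    start, accept = alt()
    if pos != len(pattern):
        raise ValueError("unexpected character at position %d" % pos)
    return start, accept

def closure(states):
    """All states reachable from `states` through epsilon edges."""
    stack, seen = list(states), set(states)
    while stack:
        for nxt in stack.pop().eps:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def matches(pattern, text):
    """True if the whole of `text` matches `pattern`."""
    start, accept = compile_regex(pattern)
    current = closure({start})
    for ch in text:
        current = closure({s.out for s in current if s.char == ch})
    return accept in current

print(matches("(a|b)*abb", "ababb"), matches("(a|b)*abb", "abab"))
```

Each set of NFA states visited during matching corresponds to one DFA state, so caching those sets is one possible route from this sketch toward the NFA-to-DFA step the comment describes.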

anonytrary 5 years ago |

This article is aimed towards students. It's great advice for students who are in college, know very little, and want to improve their CS skills.

It's poor advice for someone who already has a STEM degree and wants to build something useful and profitable. If you already know how these things work, your time is better spent on the "edge of the circle": http://matt.might.net/articles/phd-school-in-pictures/ which applies to businesses and startups as well.

If you're in the latter group -- you've already got the skills to build real shit. Don't waste your time on homework problems. Find a problem you have and build a solution for it. Don't listen to people who tell you to work on homework problems that have already been solved; it's a complete waste of your time if you already know the fundamentals.

As for stock trading bots -- if you don't have a mathematics degree or equivalent (e.g. having incredible math skills), don't even bother. You won't be profitable, and you will learn nothing useful in the process, because you will approach the problem as a naive CS student would. Smarter people than you have made trading bots and have failed miserably. Without having an extremely strong foundation in mathematics, your trading bot will amount to nothing more than a futile exercise in gluing APIs together.

fergie 5 years ago |

I would strongly recommend building something that you yourself think is cool, and not feeling that you have to conform to what other people tell you to do.

ma2rten 5 years ago | |

This comment is like the liar paradox, if people do what the comment suggests they are conforming to what another person tells them to do.

RivieraKid 5 years ago | | |

It's not, the comment's point is - don't do things you don't find particularly interesting just because other people suggest it.

watwut 5 years ago | |

Most people draw a blank when they are trying to come up with a project idea. They are programmers, not creative after all.

Which is why lists like this help.

anonytrary 5 years ago | | |

If you are trying to think of an idea, you're already doing it wrong. The best ideas are motivated by problems you encounter yourself, not by trying to think of ideas. This is the biggest mistake you see with 20-something founders. Best way to come up with ideas is to use existing solutions and realize how shitty they are. Pretty much every single company was formed this way.

> They are programmers, not creative after all.

Damn, this is a bold comment on HN. Since when does profession determine how creative you are? I've met "programmers" and "geeks" who are more creative than "artists". Your profession has little to do with your innate creativity -- it just determines how you are able to express that creativity.

Unpopular opinion, but lists like this are stupid for people who are trying to build companies. You have to try things and be pissed off at the status quo to find real problems. Nobody is going to find real problems for you, in the same way that no quant school is going to reveal their hedge fund's trading strategy to you. Finding ideas in a list is the last advice I would give to anyone. If it's public, it's probably not a profitable idea.

mukwenhac 5 years ago | | |

Being a programmer doesn't mean one is not creative, those two are not mutually exclusive. Not to mention that one can observe a problem that can be fixed by a way they already know too.

mitchdoogle 5 years ago | | |

Programmers literally create applications, features, games, whatever. You don't need to be making art to be creative. You just need to create something, and programmers do that all the time. It takes a lot of creativity to solve problems using code.

purplecats 5 years ago | |

why not, that's only one direct step removed from thinking that conformity to popular beliefs is cool.

jameskilton 5 years ago |

I would also recommend, if someone is interested in games, to do Tetris. It's a simple concept that is trickier than expected once you have to figure out the details of how it all comes together.

xamuel 5 years ago |

Here's an open-ended programming project which, in a certain formal sense, spans the entire range of all difficulty levels: write an "intuitive ordinal notation" for as large of an ordinal number as you can.

What is an "intuitive ordinal notation"? Definition: The set of intuitive ordinal notations is the smallest set P of computer programs with the following property. For every computer program p, if, when p is run, all of p's outputs are elements of P, then p is in P.

So "End.", the program which immediately ends with no outputs, is vacuously in P (all of its outputs are in P, because it has no outputs). It notates the ordinal 0. Likewise, "Print(`End.')" is in P, because its sole output, "End.", is in P; it notates the ordinal 1. Likewise, "Print(`Print(`End.')')" is in P, notating the ordinal 2. And so on.

The above can be short-circuited: "Let X=`End.'; While(True){Print(X); X=`Print(\`'+X+`\')'}". This program outputs "End.", "Print(`End.')", "Print(`Print(`End.')')", and so on forever, all of which are in P, so this program itself is in P. It notates omega, the smallest infinite ordinal.

Here's a library of examples in Python, currently going up to a notation for the ordinal omega^omega: https://github.com/semitrivial/IONs
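As a small illustration of the omega program described above, here is one way to write it as a Python generator. This is a sketch in the comment's pseudo-notation, not code taken from the linked library; `omega_outputs` is a made-up name.

```python
import itertools

def omega_outputs():
    """Yield the outputs of the omega program: notations for 0, 1, 2, ..."""
    x = "End."
    while True:
        yield x
        x = "Print(`" + x + "')"   # wrap the previous notation in one more Print

first_three = list(itertools.islice(omega_outputs(), 3))
print(first_three)
```

The generator never terminates on its own, which mirrors the `While(True)` loop; `itertools.islice` is only there to look at a finite prefix.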

avl999 5 years ago |

Building a distributed key value store is a fun project and lets you learn tons of real-world stuff. It's a great excuse to get a survey on grokking the design decisions required to build a distributed system and it will truly help one understand why NoSQL DBs scale easier than relational ones and the kind of tradeoffs they make to achieve that.

jackschultz 5 years ago |

Instead of a stock trading bot, go for daily fantasy sports contests. It can cover pretty much all parts of programming.

Web scraping to gather data, databases for storing it, ML for analyzing, front and backend web dev to show the daily information and adjust.

And instead of having to deal with trading regulations, contests can be really small and easy to enter. There are daily contests for 5 cents an entry, and you can enter 150 optimized lineups from an uploaded csv for $7.50 a day. You can really learn a ton.

nsomaru 5 years ago | |

Which services would you recommend which are developer friendly?

Could you describe your approach? What type of information would you scrape?

ashleyn 5 years ago |

Write a toy compiler for a BASIC-like language; you'll learn about what your languages are actually doing.

bob1029 5 years ago |

The database project is quite the rabbit hole if you start chasing performance. I have learned some amazing things about just how fast a 3-4 GHz CPU core actually is from this journey.

timvisee 5 years ago |

Fantastic list!

By the way, adventofcode.com is currently ongoing. Though the challenges are easy compared to the projects in this list, I highly recommend it. It covers problems you might face in big projects. With these small puzzles it's easy to experiment. It prepares you for bigger things.

userbinator 5 years ago |

IMHO a text-based browser isn't exactly in the "challenging" category, as it basically amounts to stripping all the HTML tags out and doing some very simple transformations (like replacing <br>'s with newlines.) Then again, one of the things I've been working on intermittently for the past few years is a graphical (CSS2+) browser, which is definitely in the challenging category. There are some other public efforts too:

https://github.com/lexborisov/Modest

https://github.com/litehtml

https://github.com/ArthurHub/HTML-Renderer

Along the same lines, some other challenging projects I recommend are to write decoders/renderers for existing formats like MP3, MP4, PDF, etc.
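The "strip the tags and do some simple transformations" version of a text browser can be sketched with Python's standard `html.parser`. This is a deliberately crude sketch: the set of block tags is an arbitrary choice, and it does not skip `<script>` or `<style>` content, handle links, or wrap lines.

```python
from html.parser import HTMLParser

class TextRenderer(HTMLParser):
    """Collects text content, turning <br> and the end of a few block tags into newlines."""
    BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "li"}   # arbitrary small subset

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)

def render(html):
    parser = TextRenderer()
    parser.feed(html)
    parser.close()          # flush any buffered text
    return "".join(parser.parts)

page = render("<p>Hello<br>world</p><p>Bye</p>")
print(page)
```

Everything that makes a graphical browser hard (layout, CSS, fonts) is absent here, which is the contrast the comment is drawing.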

SoSoRoCoCo 5 years ago |

I wrote a raytracer in 1996, and then a year later used Intel's VTune to speed it up. Just removing unused "return" statements gave me 3x speed increase. Apparently Borland C/C++ wasn't very smart back then.

A fun project I did after that was writing an AI frame language to do goal-stack problem solving, specifically with path finding. I connected it to the ray tracer and made movies of spheres having wars. (I used an unlicensed DivX encoder to stitch together thousands of GIFs.)

thealig 5 years ago | |

That sounds fun. Do you have code for reference? Also, I want to try building a raytracer sometime as a hobby project. Is it advisable to learn some 3D graphics concepts like WebGL and so on first?

carl_dr 5 years ago | | |

Check Ray Tracing In One Weekend out : https://raytracing.github.io/books/RayTracingInOneWeekend.ht...

You can very quickly get something on the screen with it, although getting an intuition for how it works may take a little longer and some reading around the concepts it introduces. But it does let you focus on the maths rather than worrying about also learning how Web/OpenGL works too.

djeiasbsbo 5 years ago |

For some simpler projects, I can only recommend doing some digital signal processing. For example, an audio signal is just a list of values, so you can do things like:

- Count the number of zero crossings
- Find out where they are
- Create any shape of wave by adding together multiple sine waves
- Hard clip the signal
- Stretch a signal and interpolate it with new samples
- Invert and revert a signal
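Several of these exercises fit in a few lines each once the signal is treated as a plain list of floats. The helpers below are a minimal sketch with made-up names, and the 8000 Hz sample rate is an arbitrary assumption.

```python
import math

def sine(freq_hz, n_samples, sample_rate=8000, amplitude=1.0):
    """n_samples of a sine wave at freq_hz."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

def mix(*signals):
    """Add signals sample by sample (truncates to the shortest one)."""
    return [sum(samples) for samples in zip(*signals)]

def zero_crossings(signal):
    """Indices i where the sign changes between sample i-1 and sample i.
    A sample that is exactly 0 is treated as non-negative."""
    return [i for i in range(1, len(signal))
            if (signal[i - 1] < 0) != (signal[i] < 0)]

def hard_clip(signal, limit):
    """Clamp every sample into [-limit, limit]."""
    return [max(-limit, min(limit, s)) for s in signal]

def invert(signal):
    """Flip the polarity."""
    return [-s for s in signal]

def reverse(signal):
    """Play the signal backwards."""
    return signal[::-1]

tone = mix(sine(440, 100), sine(880, 100, amplitude=0.5))   # two partials added
clipped = hard_clip(tone, 0.8)
print(len(zero_crossings(tone)), max(clipped))
```

Counting the crossings is then just `len(zero_crossings(signal))`, and the returned indices say where they are.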

For level 2, you can start processing "live":

- Create a sine synthesizer
- Create a small ring buffer of samples
- Find out how to output that audio (system audio, soundcard)
- Add MIDI support
- Add polyphony support

DSP gets hard once it has to be in real time and the latency has to be minimal. It's great exercise to mess around with it.
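The ring buffer from the "level 2" list is the piece that decouples the code producing samples from the code feeding the sound card. A minimal single-threaded sketch follows; the class and its drop-on-full and silence-on-empty policies are assumptions, and a real-time version would also need to be safe across threads.

```python
class RingBuffer:
    """Fixed-size FIFO of samples backed by a preallocated list."""
    def __init__(self, capacity):
        self.data = [0.0] * capacity
        self.capacity = capacity
        self.read_pos = 0      # index of the oldest stored sample
        self.count = 0         # number of samples currently stored

    def write(self, sample):
        """Store a sample; returns False (dropping it) when the buffer is full."""
        if self.count == self.capacity:
            return False
        self.data[(self.read_pos + self.count) % self.capacity] = sample
        self.count += 1
        return True

    def read(self):
        """Return the oldest sample, or 0.0 (silence) when the buffer is empty."""
        if self.count == 0:
            return 0.0
        sample = self.data[self.read_pos]
        self.read_pos = (self.read_pos + 1) % self.capacity
        self.count -= 1
        return sample

rb = RingBuffer(2)
results = [rb.write(1.0), rb.write(2.0), rb.write(3.0)]   # third write overflows
first = rb.read()
print(results, first)
```

The buffer size is the latency trade-off mentioned above: a bigger buffer tolerates more jitter from the producer but delays every sample by longer.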

jamil7 5 years ago | |

I came across this area recently while trying to build an iOS app that could reliably detect knocks or taps on the body of the phone. I was ultimately unsuccessful, but I learned a lot about collecting and analysing microphone and gyroscope data. I also built some nice tooling for collecting live iOS sensor data and transmitting it to Prometheus/Grafana over MQTT.

djeiasbsbo 5 years ago | | |

That sounds interesting. A cool project would be to inspect gyroscope data when the iOS keyboard is in use. Maybe it would be possible to create a keylogger based on it.

I guess it would depend on how good the data is and which keyboard is being used.

I know that this is possible on Android.

0xbadcafebee 5 years ago |

> it is really simple to create the basic "database". You can start by using the dictionary data structure that comes with whatever programming language you're using and slap a web API on top of it.

Better yet: do it in C. There's no "dictionary" object type so you have to make it yourself. You'll soon learn a whole bunch of fallacies about how those "dictionaries" actually work. After you spent a good deal of time doing that, you can switch to authentication/authorization, logging, storage, tracing, API management, resource quotas, and a raft of distributed computing issues.

I recommend basing it on Consul, it has a better general model than etcd.
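To give an idea of what "making the dictionary yourself" involves, here is a separate-chaining hash map. It is written in Python only to stay readable here (the comment suggests C, where you would also manage memory and write the hash function yourself); the class name, bucket count, and resize threshold are all arbitrary choices.

```python
class HashMap:
    """Hash map with separate chaining: a list of buckets, each a list of (key, value)."""
    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]
        self.size = 0

    def _bucket(self, key):
        # Python's built-in hash() stands in for a hand-written hash function.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)      # overwrite an existing key
                return
        bucket.append((key, value))
        self.size += 1
        if self.size > 2 * len(self.buckets): # keep chains short on average
            self._resize()

    def get(self, key, default=None):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default

    def _resize(self):
        """Double the bucket count and redistribute every pair."""
        old = [pair for bucket in self.buckets for pair in bucket]
        self.buckets = [[] for _ in range(2 * len(self.buckets))]
        for k, v in old:
            self._bucket(k).append((k, v))

store = HashMap()
for n in range(100):
    store.put("key%d" % n, n)
store.put("key7", "updated")
print(store.get("key7"), store.get("key99"), store.get("missing"))
```

Collisions, resizing cost, and the quality of the hash function are the sort of details that the built-in dictionary hides.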

Waterluvian 5 years ago |

Writing a Game Boy emulator has been the most fulfilling and interesting programming project in my life.

I love, most of all, how modular the project is. I can do an hour here or there and make meaningful progress.

I'm really eager to discover other very large programming projects that break down into sensible bites so well.

windowojji 5 years ago | |

I worked on a game boy emulator a while ago to learn C. I did pretty well and was able to run most games I came across just fine, but never got audio working...I just couldn't understand any of the stuff online about how it worked, and was unable to find any good resources that walked through how audio emulation in general worked, much less game boy's version.

shhsshs 5 years ago | |

I had never considered an emulator to be anywhere inside of the realm of possible projects I could take on for fun. I just thought it would be too complex without having a ton of very specific knowledge.

Your comment prompted me to go look up some other attempts, and I’m really glad I did. It seems much more approachable to me now. Thanks!

Waterluvian 5 years ago | | |

Consider starting with a CHIP-8 emulator. It's a perfect size for starting out and touches on a lot of important concepts you'll apply to the next emulator.

rdescartes 5 years ago |

I would recommend setting aside a long enough period (e.g. 3 months) to contribute to an open source project you are using, especially if you are not familiar with that domain. I learnt a lot about modern compilers by contributing to rust-analyzer.

tracyhenry 5 years ago |

A great compilation: https://github.com/danistefanovic/build-your-own-x

082349872349872 5 years ago | |

Building your own dogfood (if you use it daily, code a quick-and-dirty one) is a good way to learn that by holding the right end of the exponential, 80-20 solutions can easily be closer to 1-99+ in terms of manpower and code size.

(the flip side of 80-20 is: "all systems are fault tolerant, it's just that in most of them, the human is the component which tolerates the faults")

Of my IT daily drivers, I've done toy:

    - web browser / server
    - email client
    - document formatter
    - text editor
    - window manager
    - 3D / 2D graphic slicers/rasterizers w/ alphabetic fonts
    - shell
    - interpreters / compilers
    - operating system
    - VHDLish CPU
    - various data encodings (Hamming, MFM, etc.)
    - discrete transistor logic

(when I was just starting to program, I discovered the home directory of a colleague of my father's contained many experiments of this kind, and reading his work taught me C)

rex64 5 years ago |

I recently went through the process of creating a ray tracer project from zero for learning purposes. It was a humbling and eye-opening experience. I've written an article[0] to explain my process in detail if you're interested.

[0] https://alessandrocuzzocrea.com/how-i-made-a-ray-tracer/

eatonphil 5 years ago |

I'd also recommend writing an emulator for real or fake (e.g. CHIP-8) hardware. It seems complicated but the core loop gets pretty simple. It ends up giving you a much better view of both assembly and pointer semantics (useful for better understanding C).

azhenley 5 years ago | |

That’s in my original list from last year!

eatonphil 5 years ago | | |

Ah, good call. I missed the link that this was a follow up post.

mellosouls 5 years ago |

Related discussion from the original suggestions a year ago fwiw:

https://news.ycombinator.com/item?id=21790779

forgotmypw17 5 years ago |

I would add a basic feature-complete website which works on every mainstream browser starting with Mosaic.

It's much easier than it may seem, architecting it is interesting, and there is a lot of "last 10%" stuff which keeps it fun as long as you keep going.

In the demystifying area, it demystified HTML and JS history for me, forced me to work with a minimal toolkit, and taught me how to build "modern" JS features in ways which will not break browsers which don't know how to do them or have them disabled.

acutesoftware 5 years ago |

Writing even an extremely simple game without using a game engine or dedicated game library is quite an eye opener.

Make a small 2D platform game, and it covers so many areas (and it is a lot of fun!).

xwdv 5 years ago |

Being able to do challenging projects is a cold comfort when by far the most challenging project I’ve ever faced was trying to build something people would pay for.

mooreds 5 years ago | |

Ah, but that isn't a programming problem (or at least, usually not primarily a programming problem). It's a product and marketing problem.

username90 5 years ago | | |

Being a good programmer helps though; technically challenging things tend to have much smaller and worse competition.

Like, you probably won't make a lot of money making another 2D platformer no matter how well you code; they are so easy to make that there are millions of them out there already. However, if you make a performant and bug-free Factorio or Minecraft clone you will at least get a few thousand people to try it, and from there it would grow if it is fun.

preommr 5 years ago | | |

Learning what problems to solve, how to allocate resources, how to deal with building things from end-to-end, etc. are programming problems.

yudlejoza 5 years ago | |

You're being downvoted despite being a realist, having integrity, and putting things in perspective.

But I can tell you there is no shortage of oversellers in the programming community.

A programmer who's really a sleazy salesman at heart would jump at figuring out a FizzBuzz solution on their own for the first time in their life, and would proceed to mark it as the biggest achievement in the history of mankind, make a webservice out of it, shout at the top of their lungs on social media, and would start knocking on VC doors. (hint hint: many of them actually get away with something not too different).

Avtomatk 5 years ago | |

I face the same problem. I think programmers can be skilled at a lot of different tasks, but for large projects our two hands and our brain are not enough. No matter how smart you are, you need co-founders; loneliness leads you to procrastinate eternally.

altitudinous 5 years ago | |

I completely agree. Writing apps for the app store is the perfect example. I started out coding well coded apps that didn't make money, and as time has progressed my apps have lower code quality but I make more money. I focus less on code quality now and think more about marketing and appeal.

quickthrower2 5 years ago | |

In addition my concern with this is that it’s totally ok in my opinion to not do any of these and I doubt for most people doing these projects will make you better at your job (unless you happen to be doing this as your next project), so if you want to tune out in your spare time that’s ok too.

arendtio 5 years ago |

What I am really missing is some kind of real-time AI. A decade ago, I coded a bot for a first-person shooter with RTS elements and learnt so much from it (while having a lot of fun).

It starts with basic things like waypoint systems vs. area awareness systems plus the relevant routing algorithms like A*, but goes on to organizing a group of players and finding good strategies. And all of that with a limited time budget and a changing environment around you. Last but not least, you want to emulate human behavior, which is probably the hardest part, as it includes changing your behavior according to your situation (don't run straight against a wall for 10 seconds) but also taking weaknesses into account, e.g. humans can't aim perfectly.

Granted, what I have done has a huge field of challenges, but even with a 2D engine I think you can learn a lot from the experience.
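The routing part is a good place to start. Below is a minimal A* sketch on a 2D grid, assuming 4-connected movement with unit step cost and a Manhattan-distance heuristic; the grid and the `a_star` function are invented for illustration, and a game bot would run this over waypoints or navigation areas instead of raw cells.

```python
import heapq

def a_star(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) and 1 (wall) cells.
    Returns a list of (row, col) from start to goal, or None if unreachable."""
    def h(cell):                      # Manhattan distance never overestimates here
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]   # entries are (cost + heuristic, cost, cell)
    came_from = {}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:  # walk back to the start
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue                  # stale entry; a cheaper route was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = cell
                    heapq.heappush(open_heap,
                                   (new_cost + h((nr, nc)), new_cost, (nr, nc)))
    return None

grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
path = a_star(grid, (0, 0), (2, 0))   # must go around the wall through (1, 3)
print(path)
```

The limited time budget mentioned above is where this gets interesting: in practice you cap the number of expansions per frame or reuse earlier searches.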

drummersbrother 5 years ago | |

You may be interested in the game Screeps. From their website (name +.com): "It's an open-source game for programmers, wherein the core mechanic is programming your units' AI. You control your colony by writing JavaScript. "

joycian 5 years ago | | |

I looked this up and it seems very cool. Do you (or anybody else) know any more of these games or environments that can be controlled by code?

fspear 5 years ago |

I would add an emulator to the list. I've always struggled to figure out how these are built from scratch.

A long time ago I wanted to code a Neo Geo emulator but gave up before I even started; I didn't have a clue where to begin.

I am amazed at anyone that can code an emulator from scratch.

sterlind 5 years ago | |

They're really simple, in their basic form and for simple ISAs!

It mostly boils down to keeping a bunch of registers and a giant switch statement. Each case simply implements one opcode. You have an array of bytes for the memory, and some emulated devices (e.g. trigger a screen update when the framebuffer memory gets changed, or set the instruction pointer to a handler when a key gets pressed).
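To make that concrete, here is a rough sketch of the registers-plus-switch design for a made-up machine. The instruction set (HALT, LOAD, ADD, STORE), the opcode numbers, the 256-byte memory, and the four 8-bit registers are all invented for illustration and do not correspond to any real console.

```python
def run(program, max_steps=1000):
    """Execute a program for a tiny hypothetical 8-bit machine.

    Every instruction except HALT is three bytes: opcode, operand, operand.
    Returns the final registers and memory.
    """
    memory = bytearray(256)            # the "array of bytes for the memory"
    memory[:len(program)] = program    # program is loaded at address 0
    regs = [0, 0, 0, 0]                # four general-purpose registers
    pc = 0                             # instruction pointer

    for _ in range(max_steps):
        opcode = memory[pc]
        # The "giant switch statement": one branch per opcode.
        if opcode == 0x00:             # HALT
            break
        elif opcode == 0x01:           # LOAD reg, immediate
            regs[memory[pc + 1]] = memory[pc + 2]
            pc += 3
        elif opcode == 0x02:           # ADD dst, src (wraps at 8 bits)
            dst, src = memory[pc + 1], memory[pc + 2]
            regs[dst] = (regs[dst] + regs[src]) & 0xFF
            pc += 3
        elif opcode == 0x03:           # STORE reg, address
            memory[memory[pc + 2]] = regs[memory[pc + 1]]
            pc += 3
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at address {pc}")
    return regs, memory


# LOAD r0, 200; LOAD r1, 100; ADD r0, r1; STORE r0, 0x80; HALT
demo = bytes([0x01, 0, 200, 0x01, 1, 100, 0x02, 0, 1, 0x03, 0, 0x80, 0x00])
final_regs, final_memory = run(demo)
```

An emulated device would hook in at the STORE case, for example by redrawing the screen whenever the written address falls inside a framebuffer range.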

It gets hard when instructions need lots of decoding, or you have 3d graphics hardware to emulate, or if you have something like a BIOS, or if you want to JIT.

I'm actually eyeing implementing a console on an FPGA as my holiday project, with something like Chisel.

schoen 5 years ago | | |

It seems like a lot of the complexity may also come in when you have important circuits to emulate other than the CPU. In that case you'd have more to worry about in terms of timing, dataflow, and synchronization. And some platforms might conceivably have analog circuits that play an important role too, although I guess you might be able to abstract that out by trying to create a rough digital functional equivalent even if it isn't completely faithful to the behavior of the analog part.

schoen 5 years ago | |

My friend who works on the MAME project told me that it's a very friendly platform for adding new emulators (whether you want to add an emulation of a game, a device, a platform, or whatever) because it provides a good structure to work with and a lot of tools to describe recurring patterns in electronics and systems. That might be a good middle ground to start with -- trying to add support in MAME for some device that's not presently emulated there. (Apparently the MAME project really appreciates such contributions, since they think of themselves as pursuing mainly a historical preservation and documentation mission, and would always like to see more devices covered.)

Although that's not "from scratch" because it would still be using their libraries and plumbing, it could be "from scratch" in the sense that you might take an environment with 0% support for emulating a certain device or system and build it up to having 100% support. So it seems like a nice way to start.

tuankiet65 5 years ago | |

Their previous article mentioned it: https://web.eecs.utk.edu/~azh/blog/challengingprojects.html.

the_cat_kittles 5 years ago |

Another one, related to stock trading but perhaps more interesting: build a simulator for a sport. Both baseball and darts lend themselves to Markov models, and are simple enough to simulate in some detail. With darts, you can get very close to as accurate as possible. Baseball has more weird complications because of the rules. But it's fun to do, and to compare against old games to see how well your model does.
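As a rough illustration of the Markov-model idea for darts, the sketch below treats the remaining score as the state and computes the expected number of darts needed to reach exactly zero. It is heavily simplified: it ignores the double-out rule and three-dart turns, and the per-dart probabilities passed in are made-up inputs, not measured data. The function name `expected_darts` is hypothetical.

```python
def expected_darts(start, dart_probs):
    """Expected number of darts to go from `start` points to exactly 0.

    dart_probs maps the points scored by one dart to its probability.
    A dart that would overshoot is treated as a bust: the score stays put.
    """
    expected = [0.0] * (start + 1)  # expected[0] == 0: already finished
    for remaining in range(1, start + 1):
        stay = 0.0    # probability that the state does not change
        onward = 0.0  # sum of p * expected[next state] over real moves
        for points, p in dart_probs.items():
            if points == 0 or points > remaining:
                stay += p
            else:
                onward += p * expected[remaining - points]
        if stay >= 1.0:
            raise ValueError(f"cannot make progress from {remaining}")
        # Solve E = 1 + stay * E + onward for E.
        expected[remaining] = (1.0 + onward) / (1.0 - stay)
    return expected[start]
```

Swapping in a probability table estimated from a real player's throws, and adding the turn and double-out rules to the state, is where the model starts to get interesting.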

person_of_color 5 years ago | |

have you made money?

kunalpowar1203 5 years ago |

Thanks for following up with this list after your previous one. I spent a great deal of my time (including some office time) on writing a Chip-8 emulator thanks to your previous list :D

whatever_dude 5 years ago |

My own favorite is an equation parser.

Before attempting to do so I thought it was implemented as a simple seek over the string, maybe a bunch of regex stuff. I guess it can be done that way, at the cost of growing complexity; but the proper solution (with a stack, etc.) is so elegant (making it easy to add functions, operators, parentheses, variables, etc.) that it really makes one appreciate the value of good, thoughtful engineering.
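One common stack-based approach is Dijkstra's shunting-yard algorithm; the comment above does not say which method was used, so treat this as one possible sketch. It handles non-negative numbers, the binary operators in `OPS`, and parentheses. Unary minus, functions, variables, and error reporting for malformed input are left out, and all names here are illustrative.

```python
import operator
import re

# operator -> (precedence, associativity, implementation)
OPS = {
    "+": (1, "left", operator.add),
    "-": (1, "left", operator.sub),
    "*": (2, "left", operator.mul),
    "/": (2, "left", operator.truediv),
    "^": (3, "right", operator.pow),
}

TOKEN = re.compile(r"\s*(\d+\.?\d*|[-+*/^()])")


def tokenize(text):
    """Split an expression into number, operator, and parenthesis tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text):
    """Evaluate an infix expression using a value stack and an operator stack."""
    values, ops = [], []

    def apply_top():
        op = ops.pop()
        right = values.pop()
        left = values.pop()
        values.append(OPS[op][2](left, right))

    for tok in tokenize(text):
        if tok in OPS:
            prec, assoc, _ = OPS[tok]
            # Apply pending operators that bind at least as tightly.
            while ops and ops[-1] != "(" and (
                OPS[ops[-1]][0] > prec
                or (OPS[ops[-1]][0] == prec and assoc == "left")
            ):
                apply_top()
            ops.append(tok)
        elif tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops[-1] != "(":
                apply_top()
            ops.pop()  # discard the matching "("
        else:
            values.append(float(tok))
    while ops:
        apply_top()
    return values[0]
```

Adding a new operator is a single entry in `OPS`, which is a large part of the elegance described above.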

mraza007 5 years ago |

Interesting projects. I might try ray tracing in Python as I’m also exploring a lot about CGI lately.

Has anyone tried CGI? If so, how has your experience been so far?

trustfundbaby 5 years ago |

What do folks think about implementing a web crawler that you can send to a website so that it indexes every internal URL on the site? I remember sitting down to write one 100 years ago now, and finding it to be much trickier than I thought it would be.

tsjq 5 years ago | |

that's interesting. grab only the URLs, not the content?

trustfundbaby 5 years ago | | |

right, but of course to do that, you'd have to grab the content to parse it :)
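A minimal sketch of that idea using only the Python standard library: parse each page for links, keep the ones on the same host, and visit them breadth-first. The `fetch` callable (URL in, HTML text out) is a hypothetical parameter so the sketch can run without network access; a real crawler would also need robots.txt handling, rate limiting, redirects, and URL normalization, which is where it gets tricky.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(page_url, html):
    """Return absolute, fragment-free links that stay on page_url's host."""
    parser = LinkParser()
    parser.feed(html)
    host = urlparse(page_url).netloc
    links = set()
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == host:
            links.add(absolute)
    return links


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl; fetch(url) must return the page's HTML text."""
    seen = {start_url}
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        fetched += 1
        for link in internal_links(url, fetch(url)):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen
```

For a live site, `fetch` could wrap `urllib.request.urlopen` and decode the response body.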

cghendrix 5 years ago |

These look fun!

mjgs 5 years ago |

Awesome couple of articles.

person_of_color 5 years ago |

How about designing your own virtual memory system?

tsjq 5 years ago | |

that's like Prof Frank Mueller level

4778468d 5 years ago |

>> automate testing on historical data over long periods of time

I want to try this. Where can you get access to historical pricing data that includes price changes during the day, not just end-of-day prices?

known 5 years ago |

https://en.wikipedia.org/wiki/List_of_lists_of_lists FTW