What about my blog categories?

I should not have to worry about this topic, since this is only my third blog entry so far and I am pretty sure there is not too much to categorize with such a minimal “blog” production ;( but “BLOG CATEGORIZATION” is certainly a good topic to talk about. I am not referring to the fact that most blogging platforms support “Tagging and Categorizing” your blog entries so that they can be organized and presented in a particular manner for future reference, but to the fact that those categories should mean something, or perhaps even drive behavior with respect to the purpose of the blog. Ooooh, I guess I just touched a weak point of my blog site: do I really have a defined purpose for my blog? I guess I should, but that is another topic for a later moment of inspiration.

OK, so far, without any significant research effort on this matter, I have found (empirically) three major categorization approaches. I know many more may exist, but I will restrict today’s writing to the ones I believe I would look at first.

Write first, categorize later.

This is perhaps the most common approach to categorizing, since it is the most convenient and your blog platform will certainly help you with the mechanics of categorizing posts. Do not worry about what you are blogging about; just produce, produce, produce … writing material until you have compiled a significant number of titles and keywords, so you can figure out what your mind is up to all the time. I believe by then it will be fairly easy to do some sort of text mining to find the common denominators or patterns. Be aware that I am not suggesting an elaborate text mining algorithm or anything similar; manual observation and a simple counting analysis in a spreadsheet will probably give you the clue. If you are interested, you can always read more serious material on how text mining and machine learning techniques can help. I found one article published by ACM particularly interesting, since it takes “unknown” words into the equation.

Blog categorization exploiting domain dictionary and dynamically estimated domains of unknown words. Chikara Hashimoto (Yamagata University, Yonezawa-shi, Yamagata, Japan) and Sadao Kurohashi (Kyoto University, Sakyo-ku, Kyoto, Japan). Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers (poster session), Columbus, Ohio, 2008, pages 69-72. Association for Computational Linguistics, Morristown, NJ, USA.
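If you would rather skip the spreadsheet, the simple counting analysis mentioned above can be sketched in a few lines of Python. This is only a rough illustration and not the method from the article: the titles are made up, and the stop-word list and the `keyword_counts` helper are my own assumptions that you would adapt to your own writing.

```python
import re
from collections import Counter

# Hypothetical stop-word list (an assumption); extend it for your own blog.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "my", "for",
              "on", "is", "about", "what", "how", "with"}


def keyword_counts(titles):
    """Count how often each non-stop word appears across all titles."""
    counts = Counter()
    for title in titles:
        # Lowercase the title and keep only runs of letters/apostrophes.
        words = re.findall(r"[a-z']+", title.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


# Made-up titles, for illustration only.
titles = [
    "What about my blog categories?",
    "Tagging my blog posts",
    "Blog platforms and tagging",
]

# Print the most frequent words; these are your category candidates.
for word, count in keyword_counts(titles).most_common(3):
    print(word, count)
```

The words at the top of the list are the candidates for your categories; everything else can wait until you have written more.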

Categorize first, write later.

This is an approach that may be used by those who blog professionally. Since they get paid for their blogging, it is very likely that an editorial line is either imposed or strongly suggested to them. I am not a professional (syndicated) blogger and I am not making any money from writing on my personal blog, so no editorial line is imposed on me by anyone, but that does not mean I cannot impose one on myself. Having an editorial line is good because it keeps you focused on the topics you want to cover, maximizing productivity while reducing, or perhaps avoiding, time spent writing about things that are not in your personal script. Yes, this will limit your creativity somewhat, and it will force you to work out the purpose of your blog site right up front rather than later, which may be a bit cumbersome for many of us at the start. Later on, though, it will help you plan your writing and keep a balance among your different areas of interest as you produce articles. Be aware that you always have the “Uncategorized” category, where you can throw anything you have in mind regardless of the established categories, and you can always add more categories as you go. What I do not think is good practice is moving articles from one category to another once they have been categorized, since this may confuse your readers. You can, however, always use “Tagging” to help readers find your articles through different keywords, regardless of where they were categorized.

Do not categorize unless someone complains about it.

This is the easiest one to adopt but also the most difficult to keep up, since you will be tempted to apply some categorization to your blog site once it starts to grow exponentially, which takes you back to the first approach in the list. If you decide not to categorize and to run wild, writing about almost anything and becoming the Robin Hood of blog readers by taking your precious time to provide goodness to those of us poor in sapience and expertise, then you do not want to categorize all your articles only to find out that you have more categories than the New York Times. You will, however, certainly benefit from “Tagging” your work, so that the piles of articles you generate can easily be found in times to come, and not only the few displayed on your main blog page get hits from your readers. Tagging text is not an easy task either, but it is something you can do casually and change at any time without serious implications for your blog’s searchability. You may be thinking what a mountain of work it would be to tag all your articles all the time when you could be writing instead. For that there is some expert advice on the topic, such as “Automated Text Tagging” by Ben Scofield, or, if you are more attracted to a probabilistic way of doing this, you can always read what some researchers suggest for tackling this problem.

A Hybrid Probabilistic/Connectionist Approach to Text Tagging. Julian E. Boggess; Lois C. Boggess, Mississippi State University, Mississippi State, MS, USA. Technical Report #930115, 1993.
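To give a feeling for what automated tagging can look like, here is a minimal sketch that suggests tags for each post by picking words that are frequent in that post but rare across the others (a plain TF-IDF score). To be clear, this is a technique I am swapping in for illustration; it is not the approach from Ben Scofield's piece or from the technical report above, and the `suggest_tags` helper and the sample posts are made up.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep words of three or more letters."""
    return re.findall(r"[a-z]{3,}", text.lower())


def suggest_tags(posts, top_n=3):
    """Return, for each post, its top_n words ranked by TF-IDF score."""
    docs = [Counter(tokenize(post)) for post in posts]

    # Document frequency: in how many posts does each word appear?
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(doc.keys())

    total = len(docs)
    suggestions = []
    for doc in docs:
        # Words that appear in every post score zero and sink to the bottom.
        scores = {w: tf * math.log(total / doc_freq[w]) for w, tf in doc.items()}
        # Sort by score (highest first), breaking ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        suggestions.append(ranked[:top_n])
    return suggestions


# Made-up posts, for illustration only.
posts = [
    "the blog needs categories and categories need a purpose",
    "the blog uses tagging and tagging helps readers",
    "the blog has readers and readers leave comments",
]

for tags in suggest_tags(posts, top_n=1):
    print(tags)
```

With only a handful of posts the suggestions will be rough, so treat them as hints to review by hand rather than as tags to publish blindly.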

Last but not least …

As always, any hints you can provide on how to approach this “categorizing” dilemma would be very helpful, so please feel free to drop a comment or two. I promise to publish all comments except the ones that look like SPAM or come from a doubtful source.
