#!/usr/bin/perl

=pod

=head1 NAME

tv_tmdb - Augment XMLTV listings files with themoviedb.org data.

=head1 SYNOPSIS

tv_tmdb 
       [--help] [--quiet]
       [--configure] [--config-file FILE]
       [--apikey KEY]
       [--with-keywords] [--with-plot]
       [--movies-only] [--actors NUMBER] [--reviews NUMBER]
       [--stats] [--debug NUMBER]
       [--output FILE] [FILE...]

=head1 PURPOSE

tv_tmdb reads your xml file of tv programmes and attempts to find a matching entry for the 
programme title in The Movie Database community-built movie and TV database. 

Access to the TMDB system uses their API interface to access data in realtime. 
Therefore you must be online to be able to augment your listings using tv_tmdb.

Using the TMDB system requires an API key which you can get from The Movie Database website
(https://www.themoviedb.org/). The key is free for non-commercial use.

You will need to get this API key B<before> you can start using tv_tmdb.

=head1 PARAMETERS

B<--apikey KEY> your TMDB API key.

B<--output FILE> write to FILE rather than standard output.

B<--with-keywords> include tmdb keywords in the output file.

B<--with-plot> include tmdb plot summary in the output file.

B<--actors NUMBER> number of actors from tmdb to add (default=3).

B<--reviews NUMBER> number of reviews from tmdb to add (default=0).

B<--movies-only> only augment programs that look like movie listings (have a 
4 digit E<39>dateE<39> field).

B<--quiet> disable all status messages (that normally appear on stderr).

B<--stats> output grab stats (stats output disabled in --quiet mode).

B<--configure> store frequent parameters in a config file (apikey, actors).

B<--config-file FILE> specify your own file location instead of XMLTV default.

B<--debug NUMBER> output info from movie matching (optional value to increase debug level: 
2 is probably the max you will find useful).

=head1 DESCRIPTION

All programs are checked against themoviedb.org (TMDB) data (unless --movies-only is used).

For the purposes of tv_tmdb, an "exact" match is defined as a case
insensitive match against themoviedb.org data (which may or may not include the
transformation of E<39>&E<39> to E<39>andE<39> and vice-versa).

If the program includes a 4 digit E<39>dateE<39> field the following
matches are attempted, with the first successful match being used:

1. an "exact" title/year match against movie titles is done

2. an "exact" title match against tv series

3. an "exact" title match against movie titles with production dates
within 2 years of the E<39>dateE<39> value.

Unless --movies-only is used, if the program does not include a 4 digit
E<39>dateE<39> field the following
matches are attempted, the first succeeding match is used:

1. an "exact" title match against tv series

When a match is found in the themoviedb.org data the following is applied:

1. the E<39>titleE<39> field is set to match exactly the title from the
themoviedb.org data. This includes modification of the case to match and any
transformations mentioned above.

2. if the match is a movie, the E<39>dateE<39> field is set to themoviedb.org
4 digit year of production.

3. the type of match found (Movie, or TV Series) is placed in the E<39>categoriesE<39> field.

4. a url to the program on www.imdb.com is added.

5. the director is added if the match was a movie or if only one director
is listed in the themoviedb.org data (because some tv series have. 30 directors).

6. the top 3 billing actors are added (use --actors [num] to adjust).

7. genres are added to E<39>categoriesE<39> field.

8. TMDB user-ratings added to E<39>star-ratingsE<39> field.

9. TMDB keywords are added to E<39>keywordE<39> fields (if --with-keywords used).

10. TMDB plot summary is added (if --with-plot used).

11. The top TMDB reviews are added (use --reviews [num] to adjust).

=head1 HOWTO

1. In order to use tv_tmdb, you need an API key from themoviedb.org. 
These are free for Personal use. You need to create a log-in with themoviedb.org
and then click on the API link on the Settings page.
(See https://www.themoviedb.org/documentation/api )

2. run E<39>tv_tmdb --apikey <key> --output myxmlout.xml myxmlin.xmlE<39>
or 
E<39>cat tv.xml | tv_tmdb --apikey <key> tv1.xmlE<39> 
or etc.

3. To use a config file to avoid entering your apikey on the commandline, run
E<39>tv_tmdb --configureE<39>
and follow the prompts. 

Feel free to report any problems with these steps at https://github.com/XMLTV/xmltv/issues.

=head1 BACKGROUND

Like the original (pre Amazon) IMDb, "The Movie Database" (TMDB) 
(https://www.themoviedb.org/) is a community effort, and relies on people 
adding the movies.

Note TMDB is I<not> IMDB...but it's getting there! As at December 2021, TMDB 
has over 700,000 movies and 123,000 TV shows while IMDB has approx 770,000 movies 
and 217,000 TV series. However there are bound to be some films/TV programmes 
on IMDb which are not on TMDB. So if you can't find a film that you can find 
manually on IMDb then you might consider signing up to TMDB and adding it yourself.

=head1 BUGS

We only add movie information to programmes that have a 'date' element defined 
(since we need a year to work with when verifing we got the correct
hit in the TMDB data).

A date is required for a movie to be augmented. 
(If no date is found in the incoming data then it is assumed the program 
is a tv series/episode.)

For movies we look for matches on title plus release-year within two years
of the program date. We could check other data such as director or top 3 actors 
to help identify the correct match.

Headshots of the actors are possible with the TMDB data, but the XMLTV.dtd
does not currently support them.

=head1 DISCLAIMER

This product uses the TMDB API but is not endorsed or certified by TMDB.

It is B<YOUR> responsibility to comply with TMDB's Terms of Use of their API. 

In particular your attention is drawn to TMDB's restrictions on Commercial Use.
Your use is deemed to be Commercial if I<any> of:

1. Users are charged a fee for your product or a 3rd party's product or service 
or a 3rd party's service that includes some sort of integration using the TMDB APIs.

2. You sell services using TMDb's APIs to bring users' TMDB content into your service.

3. Your site is a "destination" site that uses TMDB content to drive traffic 
and generate revenue.

4. Your site generates revenue by charging users for access to content related 
to TMDB content such as movies, television shows and music.

If any of these events are true then you cannot use TMDB data in any part of your 
product or service without a commercial license.

=head1 SEE ALSO

L<xmltv(5)>

=head1 AUTHOR

Geoff Westcott, 
Jerry Veldhuis

=cut
	

use strict;
use warnings;
use XMLTV;
use XMLTV::Version "$XMLTV::VERSION";
use Data::Dumper;
use Getopt::Long;
use XMLTV::Data::Recursive::Encode;
use XMLTV::Usage <<END
$0: augment listings with data from themoviedb.org
$0 --apikey <key> [--help] [--quiet] [--with-keywords] [--with-plot] [--movies-only] [--actors NUMBER] [--reviews NUMBER] [--stats] [--debug] [--output FILE] [FILE...]

END
;
use XMLTV::TMDB;

my ($opt_help,
	$opt_output,
	$opt_quiet,
	$opt_stats,
	$opt_debug,
	$opt_movies_only,
	$opt_with_keywords,
	$opt_with_plot,
	$opt_num_actors,
	$opt_num_reviews,
	$opt_apikey,
	$opt_configure,
	$opt_configfile,
	);

GetOptions('help'			=> \$opt_help,
		'output=s'			=> \$opt_output,
		'quiet'				=> \$opt_quiet,
		'stats'				=> \$opt_stats,
		'debug:i'			=> \$opt_debug,
		'movies-only'		=> \$opt_movies_only,
		'with-keywords'		=> \$opt_with_keywords,
		'with-plot'			=> \$opt_with_plot,
		'actors=i'			=> \$opt_num_actors,
		'reviews=i'			=> \$opt_num_reviews,
		'apikey:s'			=> \$opt_apikey,
		'configure'			=> \$opt_configure,
		'config-file=s'		=> \$opt_configfile,
		) or usage(0);

usage(1) if $opt_help;

$opt_debug=1		 	if ( defined($opt_debug) && $opt_debug==0);
$opt_debug=0		 	if ( !defined($opt_debug) );

$opt_quiet=(defined($opt_quiet));
if ( !defined($opt_stats) ) {
	$opt_stats=!$opt_quiet;
}
else {
	$opt_stats=(defined($opt_stats));
}
$opt_debug=0  if $opt_quiet;


# undocumented option: for use by the test harness
my $test_opts = {};
my @_opts = qw/ updateDates updateTitles updateCategories updateCategoriesWithGenres updateKeywords updateURLs updateDirectors updateActors updatePresentors updateCommentators updateGuests updateStarRatings updateRatings updatePlot updateRuntime updateContentId updateImage numActors updateActorRole updateCastImage updateCastUrl getYearFromTitles removeYearFromTitles updateReviews numReviews /;
my %t_opts = map {$_ => 1 } @_opts;



# the --configure and --config-file options allow the storing of 
#	apikey, movies-only, actors, with-plot, with-keywords
# 
if (defined($opt_configure)) {
	# configure mode
	#   store apikey, num_actors

	# get configuration file name
	require XMLTV::Config_file;
	my $file = XMLTV::Config_file::filename( $opt_configfile, 'tv_tmdb', $opt_quiet );
     
	XMLTV::Config_file::check_no_overwrite( $file );

	# open configuration file. Assume UTF-8 encoding
	open(my $fh, ">:utf8", $file)
		or die "$0: can't open configuration file '$file': $!";
	print $fh "# config file for tv_tmdb # \n";

	# get apikey
	my $apikey = XMLTV::Ask::ask( 'Enter your TMDB api key' );
	chomp($apikey);
	#
	# write configuration file
	print $fh "apikey=$apikey\n";

	# get movies-only
	my $moviesonly = XMLTV::Ask::ask_boolean( 'Movies only?', 0 );
	print $fh "movies-only=$moviesonly\n";
	
	# get num_actors
	my $actors = XMLTV::Ask::ask( 'Enter number of actors from TMDB to add (default=3)' );
	chomp($actors);
	$actors = 3 if ( $actors eq '' || ($actors !~ /^\d+$/) );
	print $fh "actors=$actors\n";

	# get with-plot
	my $withplot = XMLTV::Ask::ask_boolean( 'Add plot from TMDB?', 0 );
	print $fh "with-plot=$withplot\n";

	# get with-keywords
	my $withkeywords = XMLTV::Ask::ask_boolean( 'Add keywords from TMDB?', 0 );
	print $fh "with-keywords=$withkeywords\n";
	
	# check for write errors
	close($fh)
		or die "$0: can't write to configuration file '$file': $!";
		
	print "Configuration completed ok \n";

	exit(0);
}

# load config file if we have one
if ( 1 ) {
	# read configuration
	#   read apikey, movies-only, actors, with-plot, with-keywords
	
	# get configuration file name
	require XMLTV::Config_file;
	my $file = XMLTV::Config_file::filename( $opt_configfile, 'tv_tmdb', $opt_quiet );

	# does file exist?
	if (-f -r $file) { 
	
		# read configuration file
		open(my $fh, "<:utf8", $file)
			or die "$0: can't open configuration file '$file': $!";

		# read config file
		while (<$fh>) {
			# comment removal, white space trimming and compressing
			s/\#.*//;
			s/^\s+//;
			s/\s+$//;
			s/\s+/ /g;
			s/\s+=\s+/=/;
			next unless length;	# skip empty lines

			# process a line
			my($k, $v) = /^(.*)=(.*)$/;

			# use config values unless given as opts
			#   commndline overrides config file
			if ( $k eq 'apikey' ) 			{ $opt_apikey 		 = $v unless defined $opt_apikey; }
			if ( $k eq 'actors' ) 			{ $opt_num_actors 	 = $v unless defined $opt_num_actors; }
			if ( $k eq 'movies-only' ) 		{ $opt_movies_only 	 = $v unless defined $opt_movies_only; }
			if ( $k eq 'with-plot' ) 		{ $opt_with_plot 	 = $v unless defined $opt_with_plot; }
			if ( $k eq 'with-keywords' ) 	{ $opt_with_keywords = $v unless defined $opt_with_keywords; }
			
			# is it one of the tester options?
			if ( exists($t_opts{$k}) ) {
				$test_opts->{$k} = $v;	
			}
		}  

		close($fh);
	}
}



# set some defaults
$opt_with_keywords=0 	if ( !defined($opt_with_keywords) );
$opt_with_plot=0	 	if ( !defined($opt_with_plot) );
$opt_num_actors=3		if ( !defined($opt_num_actors) );
$opt_num_reviews=0		if ( !defined($opt_num_reviews) );
$opt_movies_only=0 		if ( !defined($opt_movies_only) );



# check we have an api key
if ( !defined($opt_apikey) || $opt_apikey eq '' ) {	
	print STDERR <<END;
In order to use tv_tmdb, you need an API key from themoviedb.org ( https://www.themoviedb.org/ )
These are free for Personal use. You need to create a log-in with themoviedb.org
and then click on the API link on the Settings page 
( https://www.themoviedb.org/settings/api )
END
	exit(1)
}

# create package options from user input
my $tmdb_opts = {'apikey' 			=> $opt_apikey,
				 'verbose'		 	=> $opt_debug,
				 'updateKeywords'  	=> $opt_with_keywords,
				 'updatePlot'	  	=> $opt_with_plot,
				 'numActors'	  	=> $opt_num_actors,
				 'numReviews'	  	=> $opt_num_reviews,
				 'moviesonly'		=> $opt_movies_only,
				 };

# merge in the test params
my $tmdb_params = { %$tmdb_opts, %$test_opts };

# invoke a TMDB object
my $tmdb=new XMLTV::TMDB( %$tmdb_params );


# check the API works (e.g. apikey is valid)
if ( my $errline=$tmdb->sanityCheckDatabase() ) {
	print STDERR "$errline";
	exit(1);
}

# instantiate an API object
if ( !$tmdb->openMovieIndex() ) {
	print STDERR "tv_tmdb: open api client failed\n";
	exit(1);
}

# open the output file
my %w_args = ();
if (defined $opt_output) {
	my $fh = new IO::File ">$opt_output";
	die "cannot write to $opt_output\n" if not $fh;
	%w_args = (OUTPUT => $fh);
}

my $numberOfSeenChannels=0;


#------------------------------------------------------------------------
# callback function definitions, and file processing

my $w;
my $encoding;   # store encoding of input file

sub encoding_cb( $ ) {
	die if defined $w;
	$encoding = shift;	# callback returns the file's encoding
	$w = new XMLTV::Writer(%w_args, encoding => $encoding);
}

sub credits_cb( $ ) {
	$w->start(shift);
}

my %seen_ch;
sub channel_cb( $ ) {
	my $c = shift;
	my $id = $c->{id};
	$Data::Dumper::Sortkeys = 1; # ensure consistent order of dumped hash
	if (not defined $seen_ch{$id}) {
		$w->write_channel($c);
		$seen_ch{$id} = $c;
		$numberOfSeenChannels++;
	}
	elsif (Dumper($seen_ch{$id}) eq Dumper($c)) {
		# They're identical, okay.
	}
	else {
		warn "channel $id may differ between two files, "
		  . "picking one arbitrarily\n";
	}
}

sub programme_cb( $ ) {
	my $prog=shift;

	# The data from TMDB is encoded as utf-8. 
	#   The xml file may be different (e.g. iso-8859-1).

	my $orig_prog = $prog;
	# decode the incoming programme
	$prog = XMLTV::Data::Recursive::Encode->decode($encoding, $prog);

	# augmentProgram will now add tmdb data as utf-8
	my $nprog = $tmdb->augmentProgram($prog, $opt_movies_only);
	
	if ( $nprog ) {
		# re-code the modified programme back to original encoding
		$nprog = XMLTV::Data::Recursive::Encode->encode($encoding, $nprog);
		$prog = $nprog;
	}
	else {
		$prog = $orig_prog;
	}

	$w->write_programme($prog);
}

@ARGV = ('-') if not @ARGV;

XMLTV::parsefiles_callback( \&encoding_cb, \&credits_cb,
							\&channel_cb, \&programme_cb,
							@ARGV );

if ( $w ) {			# we only get a Writer if the encoding callback gets called
	$w->end();
}
#------------------------------------------------------------------------


# print some stats
if ( $opt_stats ) {
	print STDERR $tmdb->getStatsLines($numberOfSeenChannels);
}

# destroy the API object
$tmdb->closeMovieIndex();

exit(0);
